UNCLASSIFIED 


AD 413321


DEFENSE  DOCUMENTATION  CENTER 

FOR 

SCIENTIFIC  AND  TECHNICAL  INFORMATION 

CAMERON STATION, ALEXANDRIA, VIRGINIA


UNCLASSIFIED 


NOTICE: When government or other drawings, specifications or other data
are used for any purpose other than in connection with a definitely related
government procurement operation, the U. S. Government thereby incurs no
responsibility, nor any obligation whatsoever; and the fact that the
Government may have formulated, furnished, or in any way supplied the said
drawings, specifications, or other data is not to be regarded by implication
or otherwise as in any manner licensing the holder or any other person or
corporation, or conveying any rights or permission to manufacture, use or
sell any patented invention that may in any way be related thereto.


TECHNICAL DOCUMENTARY REPORT ASD-TDR-63-158


JUNE  1963 


AF  AVIONICS  LABORATORY 
ELECTRONIC  TECHNOLOGY  DIVISION 
AERONAUTICAL SYSTEMS DIVISION
AIR FORCE SYSTEMS COMMAND
WRIGHT-PATTERSON AIR FORCE BASE, OHIO


Project No. 4421, Task No. 50920




Prepared under Contract AF 33(616)-6986 by
Litton Systems, Inc., Guidance and Control Systems Division
Woodland Hills, California. Author: H. Banbrook


NOTICES 


When Government drawings, specifications, or other data are used for any
purpose other than in connection with a definitely related Government
procurement operation, the United States Government thereby incurs no
responsibility nor any obligation whatsoever; and the fact that the
Government may have formulated, furnished, or in any way supplied the said
drawings, specifications, or other data, is not to be regarded by implication
or otherwise as in any manner licensing the holder or any other person or
corporation, or conveying any rights or permission to manufacture, use, or
sell any patented invention that may in any way be related thereto.


ASTIA release to OTS not authorized.


Qualified  requesters  may  obtain  copies  of  this  report  from  the  Armed 
Services Technical Information Agency (ASTIA), Arlington Hall Station,
Arlington  12,  Virginia. 


Copies of this report should not be returned to the Aeronautical Systems
Division unless return is required by security considerations, contractual
obligations,  or  notice  on  a  specific  document. 


FOREWORD 



This report was prepared by Litton Systems, Inc., Guidance and Control
Systems Division, Woodland Hills, California, under Air Force contract
AF 33(616)-6986; task No. 50920; project No. 0(613)-4421; titled "Development
of An Airborne High Speed Digital Differential Analyser".

The studies presented herein were begun in March, 1960, and the initial
phase of the program was completed in May, 1961. The final phase of the
program was begun in July, 1961, and was completed in November, 1962.

Mr. H. Banbrook was the principal investigator in the program and performed
the bulk of the analytical studies. Mr. Banbrook also directed the
technical efforts of supporting contributors to the program who, with their
respective areas of interest, are listed below.

Equipment Program:            D. R. Roth, Project Engineer
                              F. Rosenbloom, Logical Design

Logical Design of The DD²A:   A. Whitehorn
                              Dr. G. Matthews

This report is a final report for the program, and concludes the effort
under  this  contract.  The  contractor's  report  number  is  2JD5-27. 

The  Air  Force  Project  Engineer  for  this  program  was  Captain  Aubrey 
Calton,  of  the  ASD  Electronic  Technology  Laboratory. 


The contract resulted in new and powerful processing and mechanization
techniques that have been integrated in the design of a full scale
programmable multi-increment computer for full aerospace missions. Analyses
of the fundamentally disparate processing requirements of essential
aerospace subroutines demonstrated that radically new digital computer
design techniques and more efficient computer mechanizations were required.
Mechanizations were optimized for the [illegible] types of computations, then
integrated using modal design concepts to achieve a simplified system
through new time-sharing techniques. Initial basic theoretical analyses
formed the foundations for the [illegible]. Numerical Stieltjes integration
algorithms were derived and developed for input processing and internal
computation. Fundamental digital Stieltjes algorithms attained new levels of
accuracy in single and multi-increment computers. Invention of second
difference computation and communication led to the first general (quotient)
algorithm with multi-increment accuracy and a multi-transfer unit which, in
[illegible], equals the precision of conventional devices of twice the
complexity. In meeting demanding aerospace requirements, serial-parallel
arithmetic modal techniques were developed which enable a near continuous
maximization of arithmetic capability for disparate routines of the full
program, leading to decreased computer complexity for the required
computation capability.


PUBLICATION REVIEW


This technical documentary report has been reviewed and is approved.


[Signature and name illegible]
Acting Chief, [branch name illegible]
Electronic Technology Division


TABLE OF CONTENTS


Chapter

       Abstract

I      Introduction

II     The Strap-Down Processor Constructed to Demonstrate Input
       Processing Principles and Special Purpose Mechanizations
       for Aerospace Applications

III    [Illegible] During Phase II in the General Theory of Real Time
       Computer [illegible]

IV     Design Concepts for Computer Systems with Programmable Input
       Processing Capability

V      Evaluation of Auxiliary DDA Design and Computation Features

VI     Development of Serial-Parallel DDA Mechanizations with High
       Duty Factors Which Are Capable of Quotient Algorithm with
       Derived (Single Increment) Ternary and Later Developed
       Multi-Increment Computation

VII    New Concepts of Multi-Increment Computation and Development of
       a Multi-Increment General (Quotient) Algorithm Computer with
       Second Difference Outputs Having Simplified [illegible]

VIII   Type of Programmable Modal Action of the Full Scale Incremental
       Computer Implied by Computation Task and Mechanization Factors

IX     DDA and QDD²A Simulations on the IBM 704 Computer and Primary
       Results

X      Quantitative Evaluation of Computation and Rate Handling
       Capability of DDA and QDDA Mechanizations

XI     Multi-Increment QDD²A Programs for Important Applications and
       Programming Efficiency Evaluation

XII    Programmable Transfer Operations of the QDPU and QDPU
       Programming Code Studies

XIII   Logical Design Investigations of Second Difference Incremental
       Computers with Conventional and General (Quotient) Algorithm

XIV    The Future Role of the Incremental Computer in Full Scale
       Aerospace Computer Systems and Proposed Study Efforts

XV     Brief Summary of Accomplishments of the HSDDA Study Effort


LIST OF ILLUSTRATIONS


Figure

2-1    [Illegible] Strap-Down Computer
2-2    [Title illegible]
2-3    Input Accumulator Unit
2-4    [Title illegible]
2-5    Output of [illegible]
2-6    Multiplier Unit
2-7    Multiplier Output
2-8    [Title illegible]
2-9    [Title illegible]
2-10   Output Unit
2-11   Information Flow in Output Unit
2-12   [Title illegible]
2-13   [Illegible] Organization
2-14   Bit Counter
2-15   Word Counter
2-16   Instruction Register
2-17   [Illegible] Circle
2-18   [Title illegible]
2-19   Fill [illegible]
2-20   Serial Drum Connections for [illegible] Fill
2-21   Fill [illegible]
2-22   Set Address
2-23   [Title illegible]
2-24   Input [illegible]
2-25   [Illegible] Fill Information
2-26   Tape [illegible]


2-27   Input Data Organization
2-28   Output Tape
2-29   Output Tape
2-30   Laboratory Model of the HSDDA Computer
2-31   Manual Control Unit for the HSDDA Computer
2-32   Information Processing for Evaluation of High Speed DDA Hardware
2-33   General Flow Chart
2-34   Central Differences for ΔU_jk
6-1    Register Array for Division in One Word Time
6-2    QDPU Program Diagram for Updating x, y, and z with 100 Percent
       Efficiency
6-3    QDDA Program Diagram for Toss Bombing
9-1    Schematic of Simulated DDA Reciprocal Calculation
9-2    Simulated Conventional DDA Calculation of Reciprocal Using
       Amble's Method. Integration Algorithm Classically Mechanized
9-3    Alternative Reciprocal Calculations
9-4    Servoed Amble's Method
9-5    Iteration Number (n*) Preceding Non-zero X [subscripts illegible]
11-1   QDPU Processing Schematic
11-2   Missile Velocity Calculation With Scale Factor, Linear Drift and
       Bias Error Correction of Inputs and Output of the Digital Computer
11-3   Complete Geographic Coordinate Calculations By the Proposed QDPU
       Multi-Increment Without Double Integration Mode


11-4   Geographic Coordinates Calculation by the QDPU (Multi-Increment
       and Assuming a Double Integration Mode Not Incorporated in
       Final Design)
11-5   Angular Rate Computation Program for the QDDA
11-6   Gravity Computation Program for the QDDA
11-7   QDPU Program for Thrust Cut-Off by Multi-Iteration Rate
       Computations of Input Processing and Internal Computation Routines
11-8   Strap-Down Computation QDPU Program (Craft Orientation in
       Inertial Space)
11-9   Strap-Down Computation QDPU Program Type for 3 of 5 QDPU
11-10  Strap-Down Computation QDPU Program for 2 of 5 QDPU
11-11  Schematic of the QDPU Program for Pitch Command During Re-Entry
11-12  Doppler Damping Without GP Supervision (Multi-Increment QDPU
       Using Decision Mode)
11-13  Doppler Damping Without GP Supervision (Multi-Increment QDPU
       Using Decision Mode)
12-1   QDPU Register Labeling
13-1   Arithmetic Mechanization of the DD²A Integrator
13-2   Integrator With the X Register Added
13-3   Integrator Word Structure


13-4   Final Form of the Integrator Including an Associated Marking
       Channel
13-5   DD²A Memory Structure
13-6   PDD²A Output of the Second Differential Product of xy
13-7   Elementary QDD²A


CHAPTER I


INTRODUCTION 


The study program at the conclusion of the second phase of a two-year effort
resulted in development of markedly powerful processing and mechanization
techniques which have been integrated in the design of a full scale
programmable incremental computer system for a full aerospace mission.
Especially, [illegible] for a computer has been derived which meets
computation requirements for an aerospace mission from boost through coast
to apogee, injection into orbit, retro fire, transfer orbit, re-entry,
maneuvering [illegible]. The design level also provides for sophisticated
auxiliary functions of [illegible] missions [illegible] 5 years. It was
demonstrated analytically and in simulations that the breadth of computation
capacity required by the large set of computation routines, of [illegible]
disparate natures, is not met by existing real time computer systems of
acceptable weight and size. To fulfill the goals of the contract effort, the
following sub-investigations were carried out during Phase II (in
considerable over-lapping time sequence): (1) applications surveys and
evaluations were made of the computation requirements of the subroutines for
these various applications; (2) a fundamental classification of computation
types was carried out; (3) study of system implications with respect to
overall computation capability and hardware requirements was completed;
(4) improved digital [illegible] techniques including new arithmetic modal
design concepts and special basic transfer units were developed for hardware
economy; (5) individually optimum [illegible] for each computation type were
integrated, i.e., arithmetic modal design techniques were developed for time
shared minimized mechanization to form a single versatile mechanization, for
the required new level of computation capability in a computer of moderate
hardware complexity.


Manuscript released by the author on 30 April 1963 for publication as an
ASD Technical Documentary Report.


Phase I had consisted of a broad study ranging from the analysis of the
basic nature of the most demanding airborne guidance and control computation
problems, and the general computation approaches to their solution,
to the development of specific high performance incremental computation
algorithms. To show the computer design techniques and evaluate a theory
of numerical Stieltjes integration that was developed, a special purpose
computer for strap-down computations was designed during Phase I and
later constructed during Phase II.

The general approach to full scale computer system design for aerospace
and airborne applications, of which the strap-down processor exemplified
but a major part, was seen to offer a potentially great step in computation
capability for a full scale system of given level of mechanization and
complexity, provided major developments in digital computer design
techniques exploited them in full measure. The well defined goal of Phase II
made possible an intense concentration on the challenging problems
involved and assisted in development of the design of a full scale
incremental computer system which provides a remarkably higher level of
computation capability for given hardware costs than existing computers
constructed from the same state of the art hardware. Design techniques were
developed for extracting consistently in real time greater real computation
value.

The basic classes of computations which comprise airborne and aerospace
computation programs, introduced as input processing and internal
computation in the analyses of Phase I, are provided for in a highly
integrated programmable incremental computer evolved during Phase II.
Further detailed analyses during Phase II revealed that certain additional
requirements (to be described) are not met in existing incremental
computers, but are needed in a full aerospace mission computer. However,
these and previously recognized requirements are met in a highly efficient
design which utilizes the advanced algorithm and digital processing
techniques.


Analytical efforts during Phase II in the general theory of numerical
incremental computation served to complete the establishment of the
general relationships of the processing concepts, with development of
simplified algorithms for internal computation in terms of what are referred
to as "virtual" variables.

Computation requirements and capability analyses for a full aerospace
mission demonstrated that for internal computations the computer with
the time shared arithmetic module requires (1) several bit increment
computation, (2) a precision general (quotient) integration algorithm and
(3) a degree of parallel processing capability. These sophistications are
shown realizable without significant additional cost, assuming a
programmable input processing capability, because of the concept of a single
arithmetic module that is simpler than a whole-word fast multiplier and that
uses time sharing and modal switching in the execution of all the system
functions, each of which is at an individually required level of precision.
In the internal computation design studies a major deficiency of the
conventional DDA, but a partially developed attribute of at least one of the
more recent incremental computers, was fully developed upon discovery of
digital processing techniques capable for the first time of division with
multi-increment accuracy in an incremental computer (with no direct division
capability). Because of high variable rate data and precision requirements,
the division algorithm is amenable to accurate Doppler damping in
conventional navigation, in coordinate transformation computations, such as
toss bombing and fire control, and, most important, in the generally
demanding aerospace computations.

The multi-increment computer is almost universally considered inherently
to have a more complex communication structure, but new techniques
based on the theory of information for band limited variables led to a
computer structure in which communication costs are less than in a
conventional single increment DDA with the same integrator count.
Multi-transfer unit costs are reducible to an entirely acceptable proportion
of computer system costs. For applications demanding both input processing
as well as internal [illegible] capability, a time shared arithmetic module
[illegible] parallel (high rate) several bit increment and serial
(intermediate rate) many bit increment computation was evolved. Another
remarkable multi-transfer [illegible] was developed for internal computation
by applying the new design techniques. The new multiplier is capable of
[illegible] bit transfer, with [illegible] comparable to a [illegible]
transfer unit, where, for example, [illegible] may be as large as
[illegible], depending on [illegible] by the analytic or empirically
determined character of the computation variable.

The development (during Phase I of this contract study) of a design technique
for general (quotient) algorithm computation with multi-increment accuracy
had not been met for [illegible] technical reasons. The solution during
Phase II of this problem for band-limited variables was accomplished together
with invention of second difference [illegible] in what is
believed to be a significant contribution to the field of digital computation.
In addition to overcoming this [illegible] problem the [illegible] directly
imply remarkable simplifications in general multi-increment computer
[illegible].


Integration of the many new processing techniques into a single system
was accomplished in a natural and highly efficient manner. The result
was a design structure of an incremental computer of modest weight and
volume, assuming use of state of the art hardware and modest clock rate,
which can execute the computations on continuous variables (and, with
proposed design developments, piecewise continuous variables) involved in the
many tasks including thrust cutoff, strap-down computation, guidance and
control in all phases of the aerospace mission including re-entry with
sophisticated energy management, and all at the new levels of accuracy
required.




CHAPTER II


THE STRAP-DOWN PROCESSOR CONSTRUCTED TO DEMONSTRATE
INPUT PROCESSING PRINCIPLES AND SPECIAL
PURPOSE MECHANIZATIONS FOR
AEROSPACE APPLICATIONS


2.0 INTRODUCTION - Input processing principles developed during Phase I
consisted of principles for Stieltjes numerical integration, multi-increment
and [illegible] computation of integral increments. The strap-down
processor was constructed during Phase II to demonstrate the high level of
accuracy attainable by application of those principles to a [illegible]
problem with high [illegible] sensitivity such as that of strap-down
[illegible]. It was proposed, on the basis of these and expected further
developments in processing and mechanization principles, that a full scale
design be developed for a computer capable of executing all aerospace
computation tasks of a full [illegible] a package of modest size and weight.
This chapter presents contract study results which, rather than a direct
general stage of the primary effort, are the results of special analyses,
hardware design and evaluation associated with the strap-down processor and
special purpose strap-down computer design. The strap-down processor is a
special [illegible] of input processing principles, [illegible] incorporates
special purpose computer design principles. While the primary contract study
effort was devoted to the development of a full scale programmable computer
system, it was nevertheless considered [illegible] to evaluate the
application of the strap-down processor as a design basis for [illegible].
[Illegible], such as pre-mid course missile guidance, are regarded by many
system analysts as calling for a strap-down computer, a special purpose
computer to be placed in an intermediate missile launch stage package.


For such applications the strap-down computer, for a system using state of
the art sensors and transducers, is tailored to meet a given precision
requirement. This presents the relatively straightforward design task of
modifying specific parts of the strap-down processor to adjust
multi-increment bit length, algorithm sophistication, and input accumulation
for the transducer type to meet the generally less demanding levels of those
applications.

2.1 STRAP-DOWN PROCESSOR DESIGN MODIFICATIONS DURING PHASE II AND
ANALYSIS OF RESULTANT COMPUTATION CHARACTERISTICS.

A. Introduction - Certain design modifications to the preliminary
design of Phase I were made during Phase II to obtain a special purpose
computer capable of not only high precision, but one whose design
could be readily tailored to possible special purpose applications.

B. Strap-Down Processor Design Modifications - The input processor
logical design problem, involved in the sequencing of component
calculations of the angular and inertial velocity requirements, was
reviewed to establish a process which achieves total updating
within a single slow iteration interval. The modified logical
design, adopted and described in the section on logical design,
achieves a sequencing with simpler input processor mechanization
and checkouts, and has an output variable set more naturally
assimilated by an internal computer or output device. A second
major modification is the shortened bit length of the multiplier
unit (to [illegible] bits) gained by introduction of a roundoff
operation on input accumulator outputs to the multiplier. Simple
roundoff based on the value of the 20th bit is shown in a later
section to introduce bias error significant with respect to processor
output accuracy, the effects of which can be removed by either of the
following alternative modifications:




1.  Biasing  of  analog  input  reference  voltage  level. 

2.  Logical design of a somewhat sophisticated roundoff
process.

The  latter  was  chosen  in  the  final  logical  design. 
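
To illustrate why simple roundoff biases the multiplier inputs, the following Python sketch compares two ways of shortening a 27 bit accumulator output to 20 bits. It is not the processor's logic: the function names are hypothetical, and round-half-to-even is used only as a stand-in for the "somewhat sophisticated roundoff process", which is not specified at this point in the report.

```python
from fractions import Fraction

DROP_BITS = 7  # 27 bit accumulator output shortened to a 20 bit multiplier input


def round_half_up(x: int, drop_bits: int = DROP_BITS) -> int:
    """Simple roundoff: increment the kept part when the highest dropped bit is 1."""
    return (x + (1 << (drop_bits - 1))) >> drop_bits


def round_half_even(x: int, drop_bits: int = DROP_BITS) -> int:
    """Stand-in unbiased roundoff (an assumption): exact ties go to the even neighbour."""
    half = 1 << (drop_bits - 1)
    quotient = x >> drop_bits                 # floor division by 2**drop_bits
    remainder = x & ((1 << drop_bits) - 1)    # non-negative residue
    if remainder > half or (remainder == half and quotient & 1):
        quotient += 1
    return quotient


def mean_error(rounder, values, drop_bits: int = DROP_BITS) -> Fraction:
    """Average of (rounded value - true value), in units of the input LSB."""
    values = list(values)
    total = sum((rounder(v, drop_bits) << drop_bits) - v for v in values)
    return Fraction(total, len(values))


if __name__ == "__main__":
    samples = range(-(1 << 12), 1 << 12)  # a whole number of 128-count cycles
    print("simple roundoff bias   :", mean_error(round_half_up, samples))
    print("half-even roundoff bias:", mean_error(round_half_even, samples))
```

Over this symmetric range the simple rule leaves a mean error of half an input least significant bit, while the tie-breaking rule averages to zero. A constant offset of that kind is what accumulates as bias when the rounded values are summed over many iterations.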

2.2 SCALING PROPERTIES OF THE STRAP-DOWN PROCESSOR AND
MINOR HARDWARE MODIFICATIONS FOR UNUSUAL APPLICATIONS -
The basic design features of the High Speed Digital Differential Analyser
(HSDDA) for strap-down computations are such that essentially any real
strap-down computation application can be handled, provided that, in certain
cases, a minor hardware modification is made. The rate handling capability
is the result of the whole word input feature. The minor modification which
may be necessitated in certain applications is the result of
non-programmable internal scaling, specifically in the phasing of updating
additions determining the magnitude of increments of direction cosines. The
modification which easily attains any conceivably desired rate handling
capability is a delay of extrapolator output by $d$ bit times to attain a
$2^d$ increase in rate handling capability. Anticipated future applications
do not require more than a 2 bit time delay.

Consider now the details of internal scaling of the strap-down processor.

The 20 bit input words, which are angular increments and velocity increments,
are accumulated 9 times in pre-processing. The immediately subsequent input
extrapolation is mechanized with an effective increase in amplitude of 8
(the result of a 3 bit time delay). The resulting quantities are rounded off
and effectively decreased by a factor of $2^{7}$ on entering the multiplier
as 20 bit numbers. The scale of these multiplier quantities is seen to be

$$9 \times 8 \times 2^{-7} = \frac{9}{16} \tag{II-1}$$

relative to the input words. Assuming the $U_{jk}$ quantities have scale
$S_U$, the output of the multiplier has scale $\frac{9}{16} S_U$. On passing
through the extrapolator the scale is increased by a factor of 12. The
outputs of the extrapolator are presently added to the $U_{jk}$ line at the
least significant end of the 32 bit word. Since the 32 bit word for $U_{jk}$
quantities is regarded in multiplication as unity for full register, the
latter updating mechanism amounts to a scale factor reduction of
$2^{-32} \times 2^{20} = 2^{-12}$. The scaled $U_{jk}$ quantities make this
part of the updating effective by a scale factor change of $2^{-12}/S_U$.
The net relative magnitude of $\Delta U_{jk}$ to $U$ for full input angular
rate is given by:

$$\frac{9}{16} S_U \times 12 \times \frac{2^{-12}}{S_U} = \frac{27}{32} \times 2^{-9} \tag{II-2}$$

which is the largest fractional change in $U$ which can be carried in one
iteration (without the minor modification previously discussed). The maximum
angular rate which can be handled without the proposed change is

$$\dot{\theta}_{\max} = \frac{27}{16(1024)} \,(266\ \text{it/sec}) \approx 0.43\ \frac{\text{rad}}{\text{sec}} \tag{II-3}$$

for the 266 it/sec iteration rate of the HSDDA. The simplest method of
increasing $\dot{\theta}_{\max}$ is to displace the write head of the $U$
channel the proper number of bits, increasing $\dot{\theta}_{\max}$ by a
factor of 2 for each bit moved over.
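
The scale factors quoted in this section can be multiplied out exactly. The following Python sketch does so with rational arithmetic. The function names and the `extra_delay_bits` parameter are illustrative only, and the iteration rate is taken as 2400/9 it/sec on the assumption that the output rate is one ninth of the 2400 iter/sec input rate stated in Section 2.3.

```python
from fractions import Fraction


def relative_update_per_iteration(extra_delay_bits: int = 0) -> Fraction:
    """Largest fractional change in U per iteration for full-scale input."""
    accumulate = Fraction(9)                 # nine input words summed in pre-processing
    extrapolate = Fraction(8)                # 3 bit time delay in input extrapolation
    into_multiplier = Fraction(1, 2 ** 7)    # roundoff on entering the multiplier
    multiplier_scale = accumulate * extrapolate * into_multiplier   # 9/16
    output_extrapolator = Fraction(12)
    low_end_alignment = Fraction(1, 2 ** 12) # added at low end of the 32 bit word
    # Each extra bit time of delay (hypothetical parameter) doubles the update.
    return (multiplier_scale * output_extrapolator * low_end_alignment
            * 2 ** extra_delay_bits)


def max_angular_rate(iteration_rate: Fraction, extra_delay_bits: int = 0) -> Fraction:
    """Maximum angular rate in rad/sec for a given iteration rate in it/sec."""
    return relative_update_per_iteration(extra_delay_bits) * iteration_rate


if __name__ == "__main__":
    rate = Fraction(2400, 9)                 # assumed output iteration rate
    for d in range(3):
        print(d, float(max_angular_rate(rate, d)))
```

With no extra delay the sketch gives 27/16384 per iteration and about 0.44 rad/sec, close to the 0.43 rad/sec quoted, and each added bit of delay doubles the figure.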

2.3 ROUNDOFF ERROR GROWTH IN THE STRAP-DOWN PROCESSOR.

A.  Introduction  -  The  final  logical  design  of  the  strap-down  pro¬ 
cessor  implies  a  computer  subject  to  roundoff  errors  at  only 
three  points  of  the  input  processing.  These  are: 




1.  Input  conversion,  resulting  from  sampling  inputs  to  finite 
word  length. 

2.  Roundoff of Input Accumulator Outputs, enabling use of a
fast  multiplier  of  moderate  word  length. 

3.  Roundoff  in  the  multiplier,  which  is  negligible  because  the 
Litton  fast  multiplier  effects  roundoff  at  double  word  length. 

B. Roundoff Error Resulting From Input Sampling - Input sampling
errors are assumed to be reduced to purely random errors by
elimination of converter bias errors. The 20 bit word inputs of
19 bits magnitude plus a sign bit are then sampled with a resultant
maximum error value of $\frac{1}{2} \times 2^{-19}$ of full scale. A flat
probability distribution implies that the rms error of a single
converted value is $\frac{1}{\sqrt{3}} \times 2^{-20}$ of full scale. The maximum angular
rate for full scale of inputs is adjustable in the strap-down processor,
being 0.43 rad/sec in the shipped computer, for which the
rms angular rate error of a single reading is

$$0.43 \times \frac{1}{\sqrt{3}} \times 2^{-20} \approx \frac{1}{4} \times 10^{-6}\ \text{rad/sec} \qquad \text{(II-4)}$$


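The cumulative random error in angle produced in an inertial
coordinate may be evaluated on the basis of summation at the
input rate of $1/\tau = 2400$ iter/sec, which is 9 times the output rate.
The angular error sum after a period of operation $t$ is

$$\theta_e = \sum_{n=1}^{t/\tau} \epsilon_n \tau \qquad \text{(II-5)}$$

where $\epsilon_n$ is the angular rate error of the n-th sample, with rms value $\epsilon$,
which has variance

$$\sigma_\theta^2 = \sum_{n=1}^{t/\tau} \tau^2 \epsilon^2 \qquad \text{(II-6)}$$

Hence

$$\sigma_\theta^2 = t \tau \epsilon^2 \qquad \text{(II-7)}$$

$$\sigma_\theta = \sqrt{t \tau}\, \epsilon \qquad \text{(II-8)}$$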


For $t = 6$ hr, $1/\tau = 2400$, $\epsilon = \frac{1}{4} \times 10^{-6}$, the rms error
resulting from input sampling is approximately $\frac{3}{4} \times 10^{-6}$ rad $\approx$ 1/6
arc sec; hence the random input sampling error is negligible.
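
The arithmetic of this subsection can be retraced with a short sketch. It assumes a uniformly distributed quantization error of half a least significant bit and statistically independent samples, as the text does; the function names are illustrative only.

```python
import math


def rms_quantization_error(full_scale, magnitude_bits):
    """RMS error of one converted value for a flat error distribution.

    The maximum error is half of one least significant bit
    (0.5 * 2**-magnitude_bits of full scale); a uniform distribution
    of half-width a has rms value a / sqrt(3).
    """
    half_lsb = 0.5 * 2.0 ** (-magnitude_bits)
    return full_scale * half_lsb / math.sqrt(3)


def accumulated_rms_angle(rate_rms, sample_rate_hz, duration_s):
    """RMS of the summed angle error, sqrt(t * tau) * epsilon."""
    tau = 1.0 / sample_rate_hz
    return math.sqrt(duration_s * tau) * rate_rms


if __name__ == "__main__":
    eps = rms_quantization_error(0.43, 19)            # rad/sec per reading
    sigma = accumulated_rms_angle(eps, 2400, 6 * 3600)  # rad after 6 hr
    arc_sec = math.degrees(sigma) * 3600
    print(f"eps = {eps:.3e} rad/sec, sigma = {sigma:.3e} rad = {arc_sec:.3f} arc sec")
```

The result stays well below one arc second, which supports the conclusion that the random sampling error is negligible.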

C. Error Growth Resulting From Roundoff of Multiplier Inputs - The
hardware requirements of a whole word fast multiplier are essentially
proportional to the word length of the multiplicand. A
substantial saving in multiplier complexity was effected by introducing
a roundoff operation on outputs of the input accumulator
unit (operation of which is delineated in the section on general
logical description of the strap-down processor). The basic inputs
to the strap-down processor are 20 bit words of 19 bits
magnitude plus a sign bit. A constant input is essentially multiplied
by 9 in pre-processing summations and then multiplied by
12 in the input extrapolation section; hence the input accumulator
outputs before roundoff can be 27 bit words of 26 bit magnitude
plus a sign bit. The roundoff operation, producing a number of
20 bits, if sufficiently accurate, may introduce an acceptable error
of the same low level as in input sampling. The most commonly
mechanized roundoff operation, however, can be shown to be
inadequate because it introduces a bias of $2^{-N}$, where N is the
number of bits rounded off. The simple roundoff operation consists
of adding a 1, when the most significant bit rounded off is a 1, to the
truncated number in the least significant bit position. The average
error introduced in this operation is evaluated by assuming that
all possible numerical values of the rounded-off section of bits occur
with the same frequency during extended computer operation, and
then averaging the individual errors which result for each possible
number. The rounded-off section, regarded as in a unit register,
ranges from 0 to $(1 - 2^{-N})$ and includes 1/2. The value 0 introduces
no error when it occurs. The remaining values pair off on
each side of 1/2: values at equal absolute difference from 1/2
produce equal and opposite errors. The resultant error using
the simple roundoff operation is purely that introduced when 1/2
occurs, for which the error is 1/2. Since the number 1/2 occurs
with frequency $2^{-N}$, the long term effect of the simple roundoff
method is a bias error of $2^{-N}$ times the least significant bit
of the rounded number. In the strap-down processor the simple
roundoff would create an effective bias of $2^{-7}$ of the least significant
bit of the multiplicand. For full scale direction cosines the
least significant bit passes directly through the multiplier, thence
through the extrapolator unit where it is multiplied by 12 and then
added at the least significant end of a 32 bit word of the direction
cosine line. Regarding the error produced in the direction cosine
line as an angular error in radians (since for small $\theta$,
$U = \cos(\theta + 90^\circ) \approx \theta$ in magnitude), the angular error produced per
iteration (on the average) is:


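$$2^{-31} \times 12 \times 2^{-7} \approx 0.5 \times 10^{-10}\ \text{rad} \qquad \text{(II-9)}$$

After 6 hours at 266 iter/sec the bias errors can add up to a total
angular error of

$$6(3600)(266)\,(0.5 \times 10^{-10}) \approx 3 \times 10^{-4}\ \text{rad} \approx 1\ \text{arc min} \qquad \text{(II-10)}$$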


resulting from the use of the simple roundoff operation. The bias
error can be removed using the elaborated roundoff operation,
based on the fact that error is produced when the roundoff section
of bits, regarded as in a unit register, has value 1/2, in which case
the addition of a bit to the truncated number in the simple roundoff
is now inhibited. With effective bias error removed the only error
effect remaining is the random error effect analyzed in the previous
section.
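
The averaging argument above can be checked by exhaustive enumeration. The sketch below is an illustration under stated assumptions, not the processor's logic: `round_simple` adds the most significant discarded bit to the retained number, and `round_half_even` is the standard round-half-to-even rule, swapped in here as one well-known unbiased rounding and not claimed to be the report's elaborated roundoff. Exact enumeration gives a mean error of $2^{-(N+1)}$ of the retained least significant bit for the simple rule, the same order as the $2^{-N}$ figure quoted in the text. The last function repeats the accumulation of (II-9) and (II-10) using the report's own $2^{-7}$ bias.

```python
from fractions import Fraction


def round_simple(k, n):
    """Drop n low bits of integer k, adding the top discarded bit."""
    return (k + (1 << (n - 1))) >> n


def round_half_even(k, n):
    """Drop n low bits of k, rounding exact halves to an even result."""
    q, r = divmod(k, 1 << n)
    half = 1 << (n - 1)
    if r > half or (r == half and q & 1):
        q += 1
    return q


def mean_error(rounder, n):
    """Exact mean of (rounded - true) over every n-bit discarded pattern,
    for both parities of the retained least significant bit."""
    count = 1 << (n + 1)
    total = sum(rounder(k, n) - Fraction(k, 1 << n) for k in range(count))
    return total / count


def accumulated_bias(hours, iteration_rate_hz, per_iteration_rad):
    """Total angle error if a fixed bias is added at every iteration."""
    return hours * 3600 * iteration_rate_hz * per_iteration_rad


if __name__ == "__main__":
    print("simple roundoff bias:", mean_error(round_simple, 7))
    print("half-even bias:      ", mean_error(round_half_even, 7))
    per_iteration = 2.0 ** -31 * 12 * 2.0 ** -7   # (II-9), report's figures
    print("6 hr drift (rad):    ", accumulated_bias(6, 266, per_iteration))
```

Exact fractions are used for the enumeration so that a small bias is not masked by floating point rounding in the check itself.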

2.4 LEAD-LAG EFFECTS IN INPUT SAMPLING AND ASSOCIATED ANALOGUE
FILTERS - Strap-down computations are extremely sensitive to lead-lag
effects in input variables or processing action. The analysis of Phase I
presented in Chapter 7, Section 4, of the first report,* showed that introduction of
an analogue filter of very short time lag between sensor and digital computer
was necessary in order to effect the second order Stieltjes integration algorithm
in a strap-down processor with simplified mechanization. Assuming the
strap-down processor mechanization executes precisely the processings derived
in that analysis, the time constant of the analogue filter was shown to

* First Phase Technical Documentary Report on Development of an Airborne
HSDDA, H. W. Banbrook, 7 July 1961, Contract AF33(616)-6986.


necessarily  be  —  t  where  t  is  the  output  iteration  interval.  In  the  final  mech¬ 
anization  the  processing  was  modified  slightly  to  imply  analogue  to  digital 
conversions  made  by  only  two  converters,  rather  than  six  converters  (one  for 
each  input  variable)  for  a  hardware  saving  in  any  operational  system.  The 
effect  of  serial  rather  than  parallel  samplings  of  inputs  is  to  introduce  leads 
and  lags  in  certain  angular  rate  and  acceleration  inputs.  The  leads  and  lags 
are  A  1  word  times  of  a  1/27  r  which  while  small  would  have  a  serious  effect 
on  computer  accuracy  if  not  corrected  in  an  operational  system.  Such  cor¬ 
rection  made  for  each  input  in  the  associated  analogue  filter,  by  choosing  the 
time  constant  to  be  •—  t  instead  of  ^-t  ,  is  evaluated  since  such  cor¬ 

rection  involves  no  actual  cost  in  mechanization.  A  small  error  effect  in 
the  second  order  algorithm  results  from  making  the  parameter  change  for 
exact  first  order  algorithm.  The  level  of  error  in  second  order  algorithm 
for  the  case  of  1  word  time  phasing  error  is  seen  by  the  following  analysis  for 
the  general  case  of  W  word  times.  Expressed  in  the  delay  operator  form  of 
previous  algorithm  analysis  the  necessary  compensation  of  a  W  word  time 
delay  in  the  27  word  per  iteration  processor  is 

<l-CfX  *  (l  ♦  XC  +  C*) 

W 

where  1  *  — ,  and  C  is  the  I  iteration  delay  operator.  The  required  com¬ 
pensation  for  an  analogue  filter  was  shown  to  be 

1-U«C-U*(}-U*)C8 

where  u*  *  1/K**,  and  K*  the  time  constant  of  the  modified  filter  intended  to 
compensate  the  lag  of  the  sampling. 




The  net  algorithm  effect  is  given  by  the  product  of  the  required  compensations, 
which  should  approximate  the  required  compensation  for  the  filter  with  time 
constant  K  =  determined  for  the  W  »  0  case.  Thus  the  net  uncompensated 
algorithm  effect  is  given  by: 


A 


(II- 13) 


The  lag  cancellation  yields  first  order  agreement  as  implied  by  equating  first 
order  terms  in  numerator  and  denominator. 


(II- 14) 


which  on  substituting  for  ,  -*  implies  the  choice 
1  1  A  W- 

~**K  n 


(II-  I*) 


The  net  uncompensated  algorithm  effect  for 


X  + 


is 


A  * 


1-u  C  ♦  )♦'•(£♦-) 


!•  j 


--(i~  ' 


\  e* 

1  (-  +u)W 


(II- to) 


to  second  order 
Substituting  for 


A  , +i!)C2 

A  *  1  27  54  72 ' 


which  for  1  word  time  lag  is 


A  >  1  ♦  0.  0095  C 


(11-17) 


(11-18) 




The compensation of the second order term to 1 percent corresponds to a precision
improvement of the second order algorithm relative to the first order algorithm
of 50 to 1. Note that if an additional filter were used in series with the
required filter, for which the analysis above may be used with u = 0, then

a:i  +  i5)2*2  <u-i9) 


which  for  *  *  1 

leads  to  A-  1  +  0.004  r 2  (H-20) 

indicating negligible second order algorithm error when using a mechanization
with two filters per input.

2.5 TRANSMISSION WITHIN THE STRAP-DOWN PROCESSOR OF SECOND
ORDER ALGORITHM TERMS FOR HIGHER ORDER INTEGRATION ACCURACY -
Roundoff procedures and word length determine the level of roundoff error.
If the roundoff error exceeds the level of algorithm error produced by neglecting
second order terms in the integration algorithm then, depending on the
application, either the sophistication of higher order integration is not justified,
or higher precision computation is required. Employing the most precise
roundoff techniques it is possible to compute effectively, including higher
order effects, even if they are smaller than the resolution. This is seen in
the case of first order algorithm computation in a single increment DDA in
which the difference between first order algorithms in accuracy, in a sinusoid
for example, is very great, yet the size of the term in the algorithm may be much
smaller than a single bit. In general, however, the noise in transmission of
small terms makes the value of algorithm terms a significant number of bit
positions smaller than the resolution essentially nil. Consider, for the case of
the strap-down processor, the determination of the level of magnitude of
second order algorithm terms compared to resolutions. The second order
terms in the strap-down inertial orientation computations involve changes in
angular rate between iterations. A feature of auto-piloted flight, especially
in the atmosphere, is that craft axis angular rates change at usually several
times the angular rate of the craft axes. This is explained by the fact that
control of the craft about a fixed chosen orientation without complete rotations
requires that the phase of angular rate changes must change before the angle
of orientation has changed as much as, say, 1/3 radian. In the Dutch roll
case,

$$\omega = \omega_0 \cos \beta_0 t \qquad \text{(II-21)}$$

where $\beta_0 / \omega_0$ = 3 to 6, then

$$\dot{\omega} = -\omega_0 \beta_0 \sin \beta_0 t$$

from which it is deduced that $(\dot{\omega})_{max} / \omega_{max} = \beta_0$. Assume $\beta_0 = 4 \omega_0$ in
typical flight involving craft motions similar to Dutch roll. Then the rms
magnitude of second difference terms in direction cosines is related to that
of first order terms in a ratio of about $4 \omega_0 \tau$, where $\omega_0$ is regarded as maximum
input angular rate and $\tau$ is the processor output iteration interval. In the Stieltjes
integration process of the strap-down processor the independent variable is
angular displacement, scaled as a fraction of full input register, and has a
maximum for 0.4 rad/sec of

*o  *  ° 

At 0.1% of maximum angular rate the change of angular rate per iteration is
less than the resolution of the input register having 19 bits and sign, but is
expected to have accurate statistical transmission, assuming precise sampling
techniques. The angular displacements (modified for algorithm) leave the
input accumulator and enter the multiplier, where they are multiplied by
31 bit (plus sign bit) direction cosines, the products being rounded off to 19
bits. Entering the extrapolator, second differences contribute to the output
according to a magnitude level determined by the second order terms of
angular displacements and direction cosines (and products of first order
terms). For full scale of the direction cosines of 1 their second differences
have maximum magnitude


TH?/)i4(»  1 

O  O  ~ 

t  max 


(II-24)


for $\omega_{max}$ = 0.4 rad/sec, $\tau$ = 1/266 sec. Entering with weight 5/12 in the
extrapolator action, their magnitude compared to the least significant bit of
the 19 bit output from which they are computed is generally less than


(II-25)


that is 4 times the resolution. Since the mechanism of the extrapolator unit
introduces no error whatever on a perfect 19 bit input, the only source of
error in second order term transmission would be in the output of the multiplier
(and earlier stages of computation). Since the multiplier has very precise
roundoff, the second order terms when sub-significant may be regarded as fully
transmitted but contaminated with roundoff noise in a pulse stream representation
(in pulses of magnitude of the least significant bit) in which the frequency
represents the magnitude. In Dutch roll (worst case) the overall effect of
second difference terms on inertial reference is accumulative. The
cumulative effect (reduced by a factor of two for phase effects) resulting
from second order terms in 1 hour can be estimated by taking into
account the scale with which the terms are added to the direction cosine line
(a more detailed analysis of Dutch roll is presented in the final report of
Phase I), the scale being $12 \times 2^{-31}$, for which the maximum error is

$$\frac{1}{2} \times 12 \times 2^{-31} \times (2\ \text{bits}) \times 266 \times 3600 \approx 5 \times 10^{-3}\ \text{radians} \approx 17\ \text{arc min} \qquad \text{(II-26)}$$


Implementation of carefully chosen roundoff procedures is called for in
computing with second order accuracy, since this significant error effect can
result from terms contributing magnitudes in 19 bit numbers at the least
significant bit positions. Viewed purely from the standpoint of roundoff error,
the preceding roundoff analysis showed that the required care has been taken in
the strap-down processor design.
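
The two estimates of this section can be reproduced numerically. The sketch below assumes the figures quoted in the text (0.4 rad/sec maximum rate, 266 it/sec, a Dutch roll frequency ratio of 4, a line scale of 12 x 2^-31, terms of 2 least significant bits, and a factor of one half for phase effects); the function and parameter names are illustrative only.

```python
import math


def second_to_first_order_ratio(max_rate, iteration_rate_hz, freq_ratio=4.0):
    """Ratio of second difference terms to first order terms.

    With the oscillation frequency taken as freq_ratio times the maximum
    rate, the ratio is freq_ratio * max_rate * tau, tau = 1 / rate.
    """
    return freq_ratio * max_rate / iteration_rate_hz


def accumulated_second_order_error(iteration_rate_hz, duration_s,
                                   lsb_count=2,
                                   line_scale=12 * 2.0 ** -31,
                                   phase_factor=0.5):
    """Worst-case angle error (rad) if second order terms of `lsb_count`
    least significant bits were lost at every iteration, as in (II-26)."""
    return phase_factor * line_scale * lsb_count * iteration_rate_hz * duration_s


if __name__ == "__main__":
    ratio = second_to_first_order_ratio(0.4, 266)
    err = accumulated_second_order_error(266, 3600)
    print(f"ratio = {ratio:.4f}")
    print(f"1 hr error = {err:.2e} rad = {math.degrees(err) * 60:.1f} arc min")
```

The one hour figure comes out slightly above the rounded 5 x 10^-3 rad of (II-26), which is the same order and leads to the same conclusion.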

2.6 SPECIAL PURPOSE MECHANIZATIONS FOR AEROSPACE APPLICATIONS
REQUIRING STRAP-DOWN COMPUTATIONS.

A. Introduction - The marked performance and mechanization
characteristics of state of the art sensors, and certain low cost
transducers, present constraints on digital computer design for
aerospace applications in which computer weight and volume are
significant. In the case of strap-down inertial reference computations
the relatively low accuracy level of the state-of-the-art
rate gyros (discussed in the next section) imposes an unusually
marked design constraint. The direct impact of this constraint
is the implication of a low cost transducer, the pulse stream
analogue to digital converter, which has accuracy limitations
(discussed in a later section) comparable to the rate gyros. The
implications of the sensor and transducer accuracy constraints are
the limitation to a rather narrow subset of applications of strap-down
inertial reference computations requiring modest accuracy.
This fact implies an allocation of digital computation capability
of reduced degree. Note that the implication regarding computer
sophistication depends on whether the strap-down computations
are the only task, or merely one of many tasks of the computer.

For applications in which system analysis implies the need for
a special purpose computer for simple strap-down inertial reference
computations, to be executed in an individual unit in a
missile system (physically separated from the main computer
system), the requirement for a special purpose computer of
modest size and probably modest mechanization complexity is
seen to be clear cut. For applications in which the strap-down
inertial computations or similar computations comprise only a
part of the computation task, the mechanization of the computer
may necessarily be of at least moderate sophistication because
of the overall computation capacity required. In this section
pertinent special purpose computer mechanization designs are
developed, in contrast to the primary design task consisting of the
design of a full scale computer system for a full aerospace
mission. The strap-down processor was constructed on the basis
of the analytical developments of Phase I and provides a design
basis for the special purpose computer with several applications.
It also served as the basis for the subsequently developed programmable
input processing portion of the primary design effort, which
was directed toward the development of a full scale computer for
the full aerospace mission.

B. State of the Art Sensor Limitations - State of the art rate gyros
have accuracy limitations on the order of 0.01 percent of maximum
angular rate. Until markedly higher accuracy angular rate sensors
are developed, the major application of strap-down inertial reference
systems is limited to short period operation where very high
accuracy is not important. Certainly, long period airborne navigation
does not presently fall into this class of application. The
strap-down processor developed in this contract study is capable
of executing computations for any of the broad set of applications
presently conceived in anticipation of the development of adequate
sensors. The direct purpose of constructing the computer was to
establish computation capability using the general design techniques
developed during Phase I.

The basic limitation of the rate gyro, as an angular rate sensor,
stems from the fact that an analogue voltage is generated to produce
counter balancing torque on the rate gyro when the case is
rotating relative to inertial space in order to cause the gyro to
follow the case. The source of error is that common to all
analogue electrical systems. The indicated path to major improvement
in angular rate sensors appears to be in the direction
of obviating the requirement of generating the torque signal by
analogue voltages which in turn must be assumed proportional to
the torque. It would appear that sensor development must necessarily
rest on avoidance of this error source and must perhaps
utilize some of the dramatic technical developments in the field
of precision angle generation and measurement. An angular rate
device using optical measurements also can obviate the analogue
to digital transducer problem. Consider, for example, the
Microgon (the high accuracy angle encoding system which is a
product of the Norden Division of United Aircraft) which is capable


of  reading  10  counts  per  turn  of  an  input  shaft  and  of  accurately 

following shaft angular rates corresponding to 175,000 counts per
second. An angular rate of 1 radian per second can be read by
the Microgon to $0.05 \times 10^{-4}$, which is at least 20 times more
accurate than inertial angular rate defined by a rate gyro. The
Microgon output being a whole word, it is ideal in this respect
for the strap-down processor input. The unsolved problem of
making the input shaft of the Microgon maintain nil rotation with
respect to inertial space in more than one degree of freedom is,
however, an obstacle in the way of a definite solution.

C. Transducer Performance Limitations and Mechanization Costs.

1. Performance of the Pulse Stream Analogue to Digital Converter -
The pulse stream type of transducer has relatively
simple mechanization compared to that of the whole-word
sampler (of say 14 to 20 bits). The information transmission
capability of the former can be far less unless the pulse rate
greatly exceeds the sampling rate. The quantitative computation
error associated with the pulse stream transducer
depends on the required information transmission of the
application as well as the capability of the device. Thus,
the band limited character of inputs implies the possibility
of better performance than would otherwise be expected.
The use of the pulse stream transducer for a specific application
is tempered by the effective accuracy in relation to
state of the art sensor accuracy. The lead-lag effects of a
bias-corrected pulse stream transducer, which result from
the ambiguity of sensor outputs in pulse stream representation,
are possibly serious for strap-down computations
because of the sensitivity of the latter to such effects.
These effects are exaggerated at very low rates; indeed,
for motion within an angular range of the resolution they
are capable of periods of bias on the order of the resolution.

2. A Preliminary Theory of Pulse Stream Transducer Error
Effects Applicable to Strap-Down Computations - Extensive
DDA sinusoid simulations during Phases I and II of this
contract study led to the formulation of a theory for a
particular roundoff effect which is the major type of roundoff
error for sinusoid computations. The overflow inhibitor
design technique led to much less frequency infidelity
(relative to theoretical frequency) of sinusoid calculations. The
error effect takes place when the output rate of a rounded
variable is near null. A theoretically derived formula for
the effect, which is in quantitative agreement with micro and
macro error magnitudes of sinusoid computations, is applied
in Chapter X in estimating roundoff error in general DDA
calculations. The pulse stream transducer is a roundoff
device which may induce error effects (among others) of the
same nature as that evaluated. Thus, the design principle
of the overflow inhibitor could be applied to substantially
improve the performance of a system employing the pulse
stream transducer, especially for the strap-down application.

3. Digital Computer Mechanization for Analogue to Digital Converter -
A whole word sampler is assumed in the constructed
strap-down processor. Assuming the use of the
pulse stream transducer, a significant implication in the
digital computer design results because of the pulse rate
limit of the state of the art devices. Thus, for a strap-down
processor with an iteration rate of 400 it/sec and a pulse
stream transducer with a maximum of $2 \times 10^4$ pulses/sec, the
maximum number of pulses per iteration is 50, the sum of
which may be represented by a 6 bit binary number. As a
result of the character of the strap-down computation the
digital multiplier unit appropriate for this application,
without intermediate quantization (which lowers accuracy),
is a 6 bit multiplier. The relatively large granularity imposed
by the pulse stream transducer obviates input pre-processing
(appropriate for the whole-word transducer).

The pulse stream transducer can be mated with the strap-down
processor by a summation register reset at each
iteration to zero, the contents being processed, thereafter,
in the same manner as sums of nine samples formed at high
rate in the preprocessing of whole-word inputs in the breadboard
strap-down processor. A hybrid transducer with pulse
stream and short word inputs could be mated to the breadboard
strap-down processor to retain the major mechanization
simplification of the pulse stream transducer, which is
an overall reduction of converted word length. A hybrid
mechanization which utilizes the pulse stream in parallel with
a several bit analogue to digital conversion of the sub pulse
analogue signal which triggers the pulse output is possible.
The merit of such a transducer, of intermediate complexity,
is the reduction of lead-lag sources in digital computation
resulting with increase of effective word length.
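
The sizing argument and the summation register can be sketched as follows. This is a minimal illustration, assuming signed unit pulses (+1, -1, or 0 for no pulse) arriving in fixed time slots; the function names and the slot model are hypothetical and do not describe the breadboard hardware.

```python
def pulses_per_iteration(max_pulse_rate_hz, iteration_rate_hz):
    """Largest number of pulses that can arrive in one iteration
    (integer ceiling division)."""
    return -(-max_pulse_rate_hz // iteration_rate_hz)


def register_bits(max_count):
    """Magnitude bits needed to hold counts from 0 to max_count."""
    return max_count.bit_length()


def accumulate(pulses, slots_per_iteration):
    """Sum a stream of +1 / -1 / 0 pulses into one count per iteration,
    resetting the summation register to zero at each iteration."""
    counts = []
    for start in range(0, len(pulses), slots_per_iteration):
        register = 0  # reset at the start of every iteration
        for pulse in pulses[start:start + slots_per_iteration]:
            register += pulse
        counts.append(register)
    return counts


if __name__ == "__main__":
    n = pulses_per_iteration(20000, 400)
    print(n, "pulses per iteration,", register_bits(n), "bit sum")
    print(accumulate([1, 1, 0, -1, 1, 1], 3))
```

For the 400 it/sec and 2 x 10^4 pulses/sec case this gives 50 pulses and a 6 bit sum, matching the figures in the text.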




2.7 SPECIAL PURPOSE STRAP-DOWN COMPUTER DESIGN TECHNIQUES
IMPLIED BY STATE OF THE ART SENSORS AND TRANSDUCERS - Previous
discussions concluded that a special purpose strap-down computer for aerospace
application need not achieve a computer accuracy comparable to that required
for long term inertial navigation, specifically because of state of the art sensor
accuracy limitations which limit applications to those with corresponding accuracy
requirements. The relatively high error sensitivity of the strap-down
computations (assuming no sensor error), however, implies a computer, even for
present special purpose computer applications, of distinctly greater computation
capability than state of the art incremental computers. The design of the
constructed strap-down computer may be tailored (for these special purpose
applications) to supply the required computation capability in a mechanization of
minimum complexity. The straightforward tailoring procedure consists of:

1. Modifying the input accumulator to absorb pulse stream transducer
information.

2. Reducing the multi-increment computation bit length, and obtaining
further hardware economy by mechanizing the simplified multiplier,
for which a design technique was developed during Phase I.

3. Reducing the complexity of the integration algorithm.

4. Reducing the word length to correspond to the resolution of the input
pulse information.

Specific problems associated with these procedures are:

1. Input Accumulation With Pulse Stream Transducer Inputs - The
pulse stream transducer supplies pulses of angular change (preferably
at a maximum rate which is high for precision) to a computer
with lower iteration rate (for simpler mechanization which avoids a
high degree of parallel processing). The appropriate input accumulator
may have an identical processing form to that for whole-word sampling;
however, it has a (higher) pre-processing rate (an integral multiple
of output rate) which just exceeds the maximum input pulse rate, thus
obviating appreciable buffering of inputs.




2. Multi-Increment Bit Length - Selection of inputs as independent
variables of integration implies that the multiplier must have the
capability of executing multi-transfer of a bit length equal to $\log_2$ of
the maximum number of input pulses per iteration. The
arithmetic unit may have fast multiplier mechanization for this bit
length requiring one word time per multiplication or, if the multi-transfer
bit length exceeds 5 bits, it may have a multi-pass multiplier
which may be more economically mechanized. The latter has fast
multiplication capability which is only a fraction of the required
multi-transfer bit length for one word time multiplication, and consequently
implies a reduced processor iteration rate relative to that
obtainable by the more costly multiplier. The case of a pulse stream
transducer with a maximum rate of about $10^4$ pulses/sec for a maximum
angular rate of 1 rad/sec implies, for inertial reference and
inertial velocity computation with the same bit rate as the constructed
input processor, a multi-transfer bit length of 5 bits, a word
length  of  It  bits  and  an  iteration  rate  of  S00  it/sec  using  a  fast  multi¬ 
plier.  A  two-pass  three-bit  multi -transfer  multiplier  computes  at 
250  it /sec  with  the  same  resolution. 

3. Simplified Integration Algorithm - The inherent errors of the state
of the art sensors and pulse stream transducers make computation
with a precise first order algorithm the natural choice in further
simplifying mechanization. A less than first order algorithm, such
as that employing old y instead of new y algorithm when the latter is
appropriate, would seriously degrade overall accuracy. In a special
purpose computer, based on the input processor, certain delay lines
and adders are left out in realizing the simplified algorithm with
some hardware saving.
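
The multi-pass idea in item 2 can be illustrated with a short sketch: a multi-transfer increment wider than the fast multiplier is split into narrow chunks, and one partial product is formed per pass. This is a generic shift-and-add illustration under that assumption; the function name and interface are hypothetical and the Phase I simplified multiplier design is not reproduced here.

```python
def multipass_multiply(multiplicand, increment, bits_per_pass, passes):
    """Multiply by a signed increment using a narrow multiplier.

    The increment magnitude is split into `passes` chunks of
    `bits_per_pass` bits. Each pass forms one partial product with a
    chunk and adds it at the matching bit position.
    """
    magnitude = abs(increment)
    if magnitude >= 1 << (bits_per_pass * passes):
        raise ValueError("increment exceeds the multi-transfer bit length")
    mask = (1 << bits_per_pass) - 1
    accumulator = 0
    for p in range(passes):
        chunk = (magnitude >> (p * bits_per_pass)) & mask
        accumulator += (multiplicand * chunk) << (p * bits_per_pass)
    return -accumulator if increment < 0 else accumulator


if __name__ == "__main__":
    # 40 pulses in one iteration needs 6 bits: two passes of 3 bits each.
    print(multipass_multiply(12345, 40, bits_per_pass=3, passes=2))
```

Each extra pass costs additional word times, which is why the two-pass arrangement runs at a lower iteration rate than a single-pass fast multiplier of the same total bit length.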

2.8 STRAP-DOWN PROCESSOR MECHANIZATIONS TAILORED TO SPECIAL
APPLICATIONS - As a result of differing strap-down processor accuracy and
unit weight/volume requirements for different applications, three systems
have been evolved, each of which is appropriate for the particular class of
applications for which it was designed. The three computers are briefly
described below, and the minimal version is shown in Figure 2-1.

A.  Minimal  Equipment  Strap-Down  Computer. 

1. Performance: Reference error ≤ 1°, velocity error ≤ 2 ft/sec
(in 15 minutes) for launch phases of missile guidance.

2. Analogue to Digital Converter: Angular and velocity increment
pulse stream type with 0.002° granularity.

3. Integration Algorithm: One level higher accuracy than conventional
DDA algorithm.

4. Multi-Transfer: 15 bit multi-transfer operation by 5 bit (multi-pass)
multiplier.

5. Memory: Delay line (or drum).

6. Iteration Rate: 500 it/sec with delay line (100 it/sec with drum)
memory.

7. Volume: 1/4 cubic foot with delay line (1 cubic foot with drum)
memory.

B. Full-Scale Aerospace Strap-Down Computer.

1. Performance: Reference error ≤ 0.05°, velocity error ≤ 2 ft/sec
for launch and re-entry phases (in which a pre re-entry spot
reference correction is made by stellar or other means).

2.  Analogue  to  Digital  Converter:  Whole-word  sampler  of  aneuiar 
rates;  granularity  0.  0004*. 

3.  Integration  Algorithm:  Two  levels  higher  accuracy  then  con¬ 
ventional  DDA. 

4.  Multi -Transfer:  18  bit  multi -transfer  operation  by  6  bit  (multi¬ 
pass)  multiplier. 


12-22 


5.  .  Memory:  Delay  line  (or  drum). 

Iteration  Rate:  500  it /sec  A'ith  delay  line  (100  it/»ec  with  drum) 
memory. 

7.  Volume:  1  3  cubic  foot  with  delay  line  (1  cubic  foot  with  drum) 
memory. 

C. Airborne Strap-Down Computer (effects same computation as the
Breadboard model now being fabricated).

1. Performance: Reference error ≤ 7 sec, velocity error ≤ 0.1 ft/sec
in 1 hour of airborne flight.

2. Analogue to Digital Converter: Whole-word sampler of angular
rates, accelerometer inputs; granularity 0.0001°.

3. Integration Algorithm: Two levels higher accuracy than conventional
DDA.

4. Multi-Transfer: 20 bit multi-transfer operation by 10 bit
(multi-pass) multiplier.

5. Memory: Drum.

6. Iteration Rate: 130 it/sec.

7. Volume: 1 cubic foot.


Figure 2-1. Minimal Equipment Strap-Down Computer (legible block labels: Multiplier Unit, Subtractor, Timing Logic)


2.9 LOGICAL DESCRIPTION OF THE STRAP-DOWN PROCESSOR.


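A. Introduction - The strap-down processor is a special purpose
computer which solves the following equations:

$$
\frac{d}{dt}
\begin{bmatrix} U_{j1} \\ U_{j2} \\ U_{j3} \end{bmatrix}
=
\begin{bmatrix}
W_3 U_{j2} - W_2 U_{j3} \\
W_1 U_{j3} - W_3 U_{j1} \\
W_2 U_{j1} - W_1 U_{j2}
\end{bmatrix},
\qquad
\Delta V_j = A_1 U_{j1} + A_2 U_{j2} + A_3 U_{j3},
\qquad j = 1, 2, 3
\tag{II-27}
$$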


The angular velocities (W1, W2, W3) and accelerations (A1, A2, A3)
are  inputs  to  the  computer  from  a  magnetic  tape.  Each  input  has 
twenty  bits  including  sign.  The  direction  cosines  which  are  com¬ 
puted  are  recorded  on  magnetic  tape  and  are  also  routed  back  into 
the  computer  to  be  used  for  further  computations.  The  velocities 
are  recorded  on  the  output  tape  but  they  are  not  used  in  the  computer. 

An  iteration  of  the  computer  consists  of  the  following  cycles.  The 
input  data  is  transferred  from  the  input  tape  into  a  core  buffer 
memory  unit.  Then  it  is  entered  into  the  computer  and  operated 
on  to  solve  the  equations  listed  above.  The  results  are  then 
recorded  on  the  output  tape  and  also  recirculated  on  the  magnetic 
drum  to  be  used  in  the  following  iterations. 
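As a rough illustration of what one iteration computes, the sketch below applies equations (II-27) with a plain rectangular (first order) update. It is not the processor's actual algorithm, which also involves the input accumulator and the extrapolator. The function name `iterate` and the convention that the inputs are already scaled to per-iteration increments are assumptions made for the example.

```python
def iterate(U, W, A):
    """One rectangular-rule step of equations (II-27).

    U is a 3x3 list of direction cosines, U[j][k] standing for U_j(k+1).
    W and A are assumed (for this sketch) to be angular and velocity
    increments over one iteration, so no explicit time step appears.
    Returns the updated direction cosines and the velocity increments.
    """
    W1, W2, W3 = W
    A1, A2, A3 = A
    new_U = []
    dV = []
    for Uj1, Uj2, Uj3 in U:
        # Velocity increment along reference axis j.
        dV.append(A1 * Uj1 + A2 * Uj2 + A3 * Uj3)
        # Direction cosine update for row j.
        new_U.append([
            Uj1 + (W3 * Uj2 - W2 * Uj3),
            Uj2 + (W1 * Uj3 - W3 * Uj1),
            Uj3 + (W2 * Uj1 - W1 * Uj2),
        ])
    return new_U, dV


if __name__ == "__main__":
    identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
    U, dV = iterate(identity, (0, 0, 0.1), (1, 0, 0))
    print(U)
    print(dV)
```

The updated cosines are returned alongside the velocity increments because, as described above, only the cosines are fed back into the next iteration.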




The  logical  equations  are  expressed  in  a  six-letter  format.  The 
unit  designation  (2nd  letter)  is  as  follows: 

A  Input  Accumulator 

C  Control 

E  Extrapolator 

I  Tape,  Core,  and  MCU  inputs 

M  Multiplier 

N  Input 

P  Control  Panel 

T  Output 

The  necessary  reading,  writing  and  control  circuits  are  included 
in  the  computer  to  allow  the  magnetic  tape  units  to  be  connected 
directly  to  the  computer  without  any  external  control  units.  Either 
IBM 727 or IBM 729 II tape units can be used.

B. Input Unit - The input unit consists of the tape input amplifiers,
the  tape-to-core  buffer  register,  the  core  output  buffer  register, 
the  serial  input  flip  flops  and  two  parity  check  circuits. 

Signals from the input tape are in NRZI form, a change in magnetic
flux indicating a binary "one". The signal is put into both an
inverting amplifier and a non-inverting amplifier, and will alternately
be read from these two circuits. Both circuits will not have a true
output at the same time.

The information on the tape is put into the tape-to-core buffer
register (FNW00-FNW06). Because of skew on the tape, all seven
bits of information might not be recorded on the same pulse, so as
soon as any bit in the tape-to-core buffer register is turned on the
input counter begins to count from zero to seven. If the instruction
register is set to "fill core," the information is written into the
core on the count of five. If the instruction register is set to
"ready for instruction" then the instruction register is set from
the tape-to-core buffer register on the count of seven. The
tape-to-core buffer register is reset on the count of seven.

Information is read from the core into the core buffer register under
control of the input counter during the fill drum, set address and
compute instructions. From the core output buffer register the
information is shifted into the serial input flip flops and then into
the input summing registers in the input accumulator unit. Each of
the instructions will be explained in detail in another section.

Parity is checked (see Figure 2-2) both at the tape-to-core buffer
register and the core output buffer. The tape check can stop the
computer under control of the operator. This will be explained in
more detail in another section. The core output parity check will
turn on a light if an error is indicated but computer operation will
not be interrupted.

C. Input Accumulator Unit - The input accumulator (Figure 2-3) accepts
accelerometer data and angular rate data from the serial input flip
flops in the input unit. There are three accelerometer inputs (A1*,
A2*, A3*) and three rate gyro inputs (W1*, W2*, W3*). During a
compute cycle of 27 word times, which will be denoted by T, each
input is summed into the input summing register once every third
word time. If the first value is Wi*(t + T/9) (or Ai*(t + T/9)) then
the final sum becomes

$$
W_i^*(t + T/9) + W_i^*(t + 2T/9) + \dots + W_i^*(t + 8T/9) + W_i^*(t + T) = S(t + T)
\tag{II-28}
$$

Figure 2-2. Input Unit

Figure 2-3. Input Accumulator Unit


The sum S(t + T) goes to a 27 word drum line (WAW02), to a three-bit
delay line (FAD00-FAD02) and into an adder (DAA02). The
output of the 27-word line (FAR02) is S(t), the output of the three-bit
delay is 8S(t + T), and the output of DAA02 is 8S(t + T) + S(t + T)
= 9S(t + T). FAR02 is subtracted from DAA02 in DAA03. This
gives 9S(t + T) - S(t), which is Wi and Ai. These values are put
into a one-word delay line (WAW03) at the proper word times and
the output of this line is the multiplicand input to the multiplier
unit. The flow of information is shown in Figure 2-4.
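The arithmetic of the input accumulator can be sketched in a few lines. The example below is only an integer model of the data path described above. The function names are hypothetical, and the three-bit delay is modeled as a left shift by three places, which is what a three-bit delay amounts to for a serial word presented least significant bit first (an assumption about the bit order).

```python
def summed_input(samples):
    """Sum the nine samples taken during one compute cycle, giving S(t + T)."""
    if len(samples) != 9:
        raise ValueError("expected nine samples per compute cycle")
    return sum(samples)


def multiplicand(s_new, s_old):
    """Form 9*S(t + T) - S(t) the way the adders are described.

    s_new models S(t + T); s_old models S(t) read back from the
    27-word drum line (FAR02).
    """
    eight_s = s_new << 3        # three-bit delay line: 8 * S(t + T)
    nine_s = eight_s + s_new    # adder DAA02: 9 * S(t + T)
    return nine_s - s_old       # subtraction in DAA03


if __name__ == "__main__":
    s_prev = summed_input([2] * 9)
    s_now = summed_input([3] * 9)
    print(multiplicand(s_now, s_prev))
```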

The  computer  word  length  is  32  bits  and  the  input  word  length  is 
20  bits.  The  maximum  length  of  9S(t+  T)  -  S(t)  is  27  bits,  and 
must  be  rounded  off  before  being  entered  into  the  multiplier  unit 
which  has  20  bit  registers. 

The output of FAR03 is shown in Figure 2-5.

The round off is performed as follows. If bit 6 is zero enter bits
7-26 as they are. If bit 6 is a one and any of bits 0-5 are also a
one, add one to bits 7-26. If bit 6 is a one and bits 0-5 are all
zero then the multiplicand is increased by one only if bit 7 is a one.
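The rule above amounts to rounding to the nearest value, with exact ties going to the even neighbor. A minimal sketch follows, assuming the 27-bit quantity is held as a signed Python integer with bit 0 as the least significant bit; the function name is hypothetical.

```python
def round_off(value):
    """Drop bits 0-6 of `value`, rounding bits 7-26 per the stated rule."""
    low = value & 0x7F      # bits 0-6 (bit 6 is the "half" bit)
    kept = value >> 7       # bits 7 and up; arithmetic shift keeps the sign
    half = 0x40             # bit 6 set, bits 0-5 clear
    if low > half:
        kept += 1           # bit 6 is one and some lower bit is one
    elif low == half and (kept & 1):
        kept += 1           # exact tie: add one only if bit 7 is one
    return kept


if __name__ == "__main__":
    for v in (63, 64, 65, 192):
        print(v, round_off(v))
```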

D. Multiplier Unit - The multiplier unit (Figure 2-6) consists of three twenty
bit registers, a twenty-bit parallel adder, a two's complement carry
flip flop (FMC00), and a multiplier input flip flop (FMQ01). This
unit multiplies a twenty-bit multiplicand by a thirty-two bit multiplier
each word time. The output is a twenty-bit product which is
placed in the least significant end of a thirty-two bit word. The
multiplication process is basically that described by Booth and
Booth.*


*  Booth,  A.  D. ,  and  Booth,  K.  H.  V. ,  Automatic  Digital  Calculators,  Academic 
Press,  pp.  45-48,  1956. 
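For readers unfamiliar with the method cited above, the following sketch shows Booth recoding in its simplest arithmetic form. It is not a model of the registers or timing in this multiplier unit; the function name and the use of unbounded Python integers are assumptions made for illustration.

```python
def booth_multiply(multiplicand, multiplier, bits):
    """Signed product by Booth recoding of a `bits`-wide multiplier.

    Each multiplier bit is compared with the bit to its right
    (an implied zero below bit 0):
      1 then 0 (start of a run of ones) -> subtract the shifted multiplicand
      0 then 1 (end of a run of ones)   -> add the shifted multiplicand
      equal bits                        -> no operation
    """
    product = 0
    previous = 0
    for i in range(bits):
        current = (multiplier >> i) & 1   # two's complement bit i
        if current == 1 and previous == 0:
            product -= multiplicand << i
        elif current == 0 and previous == 1:
            product += multiplicand << i
        previous = current
    return product


if __name__ == "__main__":
    print(booth_multiply(7, -3, 8))
    print(booth_multiply(-5, 6, 8))
```

Because the recoding handles the sign bit of the multiplier like any other bit, no separate sign correction is needed.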


Figure 2-5. Output of FAR03. (Word layout for DAA00, DAA01, DAA02, and FAR03, with bit positions 31, 27, 26, and 25 marked and the sign, MSB, and LSB of the multiplicand indicated.)


Figure  2-6.  Multiplier  Unit 

The multiplicand input is entered serially into the B-register from
the input accumulator unit during bit times 7-25. During bit times
26-30 the B-register remains static with the first 19 bits of the
new multiplicand in bits 1-19 and the sign of the last product is bit
zero. This allows the output to be in the form shown below in Figure
2-7.

Sign  MSB  LSB

Figure 2-7. Multiplier Output

At bit time 31 the complement of the sign bit of the multiplicand is
entered into FMM19 and the complement of the B-register is shifted
into the M-register shifted right one bit. This puts the complement
of the multiplicand into the M-register and the multiplier unit is
ready for the next multiplication. As is shown in Figure 2-8, the
multiplicand does not change every word time. The new multiplicand
is brought in as described above only during word times 5, 11, 17,
20, 23 and 26. During all other word times the multiplicand is
already in the M-register, either in true or complemented form.
If it is in complemented form, the two's complement flip flop (FMC00)
will be on. If it is not in this form then the M-register is
complemented and FMC00 is turned on at bit time 31. The reason for this
is that in the Booth and Booth multiplication, a subtraction is the
first operation.

Word Time   Multiplicand   Multiplier
    0           W1            U13
    1           W1            U23
    2           W1            U33
    3           W1            U12
    4           W1            U22
    5           W1            U32
    6           W2            U11
    7           W2            U21
    8           W2            U31
    9           W2            U13
   10           W2            U23
   11           W2            U33
   12           W3            U12
   13           W3            U22
   14           W3            U32
   15           W3            U11
   16           W3            U21
   17           W3            U31
   18           A3            U13
   19           A3            U23
   20           A3            U33
   21           A2            U12
   22           A2            U22
   23           A2            U32
   24           A1            U11
   25           A1            U21
   26           A1            U31

Figure 2-8. Multiplication Schedule

The multiplier inputs come from two read heads on a drum line in
the output unit. The correct value is at FTR05 during word times
0-17 and at FTR06 during word times 0-2 and 6-26. It is read
from FTR05 during word times 0-17 and from FTR06 during word
times 6-26.

The contents of the output read flip-flop (FTR05 or FTR06) and
FMQ01 control the operation of the multiplier unit. If an addition
or subtraction is to be done, the adder output is transferred to the
accumulator on the half clock. Then the accumulator is shifted
right on the master clock. The sign bit is shifted into bit 18 and the
sign remains unchanged at all bit times except bit time 11. During
this bit time, "one" is effectively added to bit 18 by shifting the
complement of the sign into 18, and resetting the sign to zero.
Since twenty shifts remain, the added "one" at bit time 11 has the
effect of rounding off the product.

E. Extrapolator Unit - The output of the multiplier unit goes to the
extrapolator unit (Figure 2-9) where it goes through an extrapolation
by T/2 and a multiplication by 12. If the input to the extrapolator
unit is y(t) then the output is:

$$
12\left[\, y(t) + \tfrac{1}{2}\nabla y(t) + \tfrac{5}{12}\nabla^2 y(t) \right]
= 23\,y(t) - 16\,y(t-1) + 5\,y(t-2)
\tag{II-29}
$$

where ∇y(t) = y(t) - y(t-1) is the backward difference. The
multiplication by 12 is included to avoid a division by 12. Figure
2-9 shows the mechanization of the extrapolator unit.
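The equality of the two forms in (II-29) can be checked numerically. The sketch below assumes the differences in the bracketed form are backward differences, which is the reading under which the coefficients 23, -16 and 5 come out; the function names are hypothetical.

```python
from fractions import Fraction


def extrapolator_output(y0, y1, y2):
    """Integer form: 23*y(t) - 16*y(t-1) + 5*y(t-2)."""
    return 23 * y0 - 16 * y1 + 5 * y2


def difference_form(y0, y1, y2):
    """12 * [y + (1/2) * first difference + (5/12) * second difference]."""
    d1 = y0 - y1                # first backward difference
    d2 = y0 - 2 * y1 + y2       # second backward difference
    return 12 * (y0 + Fraction(1, 2) * d1 + Fraction(5, 12) * d2)


if __name__ == "__main__":
    for triple in [(3, 2, 1), (10, 4, -7), (0, 0, 1)]:
        print(triple, extrapolator_output(*triple), difference_form(*triple))
```

Working with the integer form keeps every intermediate value a whole number, which is the stated reason for carrying the factor of 12.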

F. Output Unit - The output unit (Figure 2-10) accepts inputs from
DEA04 and processes them on three drum lines as shown in the
schedule of Figure 2-11. The output from DEA04 is a multiplication
of two variables and these must be combined to form the following
equations:

$$
\begin{aligned}
\frac{dU_{j1}}{dt} &= W_3 U_{j2} - W_2 U_{j3} \\
\frac{dU_{j2}}{dt} &= W_1 U_{j3} - W_3 U_{j1} \\
\frac{dU_{j3}}{dt} &= W_2 U_{j1} - W_1 U_{j2} \\
\Delta V_j &= A_1 U_{j1} + A_2 U_{j2} + A_3 U_{j3}, \qquad j = 1, 2, 3
\end{aligned}
\tag{II-30}
$$

After the above equations have been formed, they must be available
for use as a multiplier input (the direction cosines only) at the
correct word times, and they must be recorded on the output tape.


The direction cosines must be fed back into the multiplier unit in
the order indicated in the description of the multiplier unit. The
correct variable is at FTR05 during word times zero through 17
and at FTR06 during word times six through 26.

The information that is recorded on the output tape is first put on
the one-word drum line through WTW02 where it is recirculated
for at least three word times. The drum line inputs are
shown in Figure 2-12. The tape accepts a character of
seven bits in parallel at the rate of two characters per word time.
A 32 bit word is divided into six characters each containing six
bits of information and one parity bit, and it is recorded on the
tape during three consecutive word times. Four of the bits are
recorded twice. For each iteration there are 13 words of information
to be recorded on the output tape: one address, 9 direction
cosines, and three velocity increments. This is done within the
compute cycle and the first 27 words after the compute cycle.
During each of these 54 word times a character is recorded at bit
time four and bit time 20. The order in which the information is
output can be seen in Figure 2-11 as the output of FTR08.

Figure 2-11. Information Flow in Output Unit

Figure 2-12. Drum Line Inputs (columns: Word, WTW00, WTW01, WTW02 during compute, WTW02 during the first 27 words after compute)

The output of the one-word drum line, FTR08, is continuously
being transferred into FTS05-FTS00 as shown in Figure 2-10.
Then, at the proper times, FTS00-FTS05 are transferred in
parallel into FTT00-FTT05. A parity bit is generated in DTP02
and then the character is recorded on the tape either at bit time
four or bit time 20. Figure 2-13 shows the organization of the
output word. Note that bits 5, 10, 21 and 26 are each output twice.
This fills out the 36 bits of information in the six characters.


      FTS(n)              FTT(n) Output Pulse      Output Word
Bit Time  Word Time      Bit Time  Word Time          Bits
    6        n            20-21       n                0-5
    0        n+1           4-5        n+1             26-31
   16        n+1          20-21       n+1             10-15
   22        n+1           4-5        n+2             16-21
   11        n+2          20-21       n+2              5-10
   27        n+2           4-5        n+3             21-26

Figure 2-13. Output Organization
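The character layout of Figure 2-13 can be sketched as a small packing routine. Several details below are assumptions rather than statements from the report: bit 0 is taken as the least significant bit of the word, the lowest bit of each field is placed in the lowest bit of the character, the parity bit is placed above the six data bits, and odd parity is used (suggested by the core check described later, which flags an even parity as an error). The names are hypothetical.

```python
# Bit fields of the 32-bit word in the order they are output (Figure 2-13).
FIELDS = [(0, 5), (26, 31), (10, 15), (16, 21), (5, 10), (21, 26)]


def word_to_characters(word):
    """Split a 32-bit word into six 7-bit tape characters (6 data + parity)."""
    characters = []
    for low_bit, _high_bit in FIELDS:
        data = (word >> low_bit) & 0x3F            # six information bits
        ones = bin(data).count("1")
        parity = 0 if ones % 2 == 1 else 1         # make the total count odd
        characters.append((parity << 6) | data)
    return characters


if __name__ == "__main__":
    # Bit 5 belongs to two fields, so it shows up in two characters.
    print(word_to_characters(1 << 5))
```

Bits 5, 10, 21 and 26 each fall in two of the six fields, which is how 32 bits fill 36 character positions.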


G. Control Unit - The control unit is made up of a number of
flip flops which are used to control the cycles in all the other
units, the amplifiers which specify certain conditions of the
flip flops, and the clock system.

1.  Bit  Counter  (FCB00-FCB04)  -  The  bit  counter  is  a 
binary  counter  which  counts  from  0  to  31  stepping 
one  count  on  every  master  clock  pulse.  The  states 
of  the  counter  are  shown  in  Figure  2-14. 


State  FCB04  FCB03  FCB02  FCB01  FCB00
  0      0      0      0      0      0
  1      0      0      0      0      1
  2      0      0      0      1      0
  3      0      0      0      1      1
  4      0      0      1      0      0
  5      0      0      1      0      1
  6      0      0      1      1      0
  7      0      0      1      1      1
  8      0      1      0      0      0
  9      0      1      0      0      1
 10      0      1      0      1      0
 11      0      1      0      1      1
 12      0      1      1      0      0
 13      0      1      1      0      1
 14      0      1      1      1      0
 15      0      1      1      1      1
 16      1      0      0      0      0
 17      1      0      0      0      1
 18      1      0      0      1      0
 19      1      0      0      1      1
 20      1      0      1      0      0
 21      1      0      1      0      1
 22      1      0      1      1      0
 23      1      0      1      1      1
 24      1      1      0      0      0
 25      1      1      0      0      1
 26      1      1      0      1      0
 27      1      1      0      1      1
 28      1      1      1      0      0
 29      1      1      1      0      1
 30      1      1      1      1      0
 31      1      1      1      1      1

Figure 2-14. Bit Counter


2. Word Counter (FCW00-FCW05) - The word counter counts from
0 to 26 as shown in Figure 2-15. It counts up one state every time
the bit counter is in state 31.


State  FCW05  FCW04  FCW03  FCW02  FCW01  FCW00
  0      0      0      0      0      0      0
  1      0      0      0      0      0      1
  2      0      0      0      0      1      0
  3      0      0      0      1      0      0
  4      0      0      0      1      0      1
  5      0      0      0      1      1      0
  6      0      0      1      0      0      0
  7      0      0      1      0      0      1
  8      0      0      1      0      1      0
  9      0      1      0      0      0      0
 10      0      1      0      0      0      1
 11      0      1      0      0      1      0
 12      0      1      0      1      0      0
 13      0      1      0      1      0      1
 14      0      1      0      1      1      0
 15      0      1      1      0      0      0
 16      0      1      1      0      0      1
 17      0      1      1      0      1      0
 18      1      0      0      0      0      0
 19      1      0      0      0      0      1
 20      1      0      0      0      1      0
 21      1      0      0      1      0      0
 22      1      0      0      1      0      1
 23      1      0      0      1      1      0
 24      1      0      1      0      0      0
 25      1      0      1      0      0      1
 26      1      0      1      0      1      0

Figure 2-15. Word Counter
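The state table of Figure 2-15 follows a regular pattern: each pair of flip flops holds one base-three digit of the word count, which matches the grouping of the 27 word times into threes. The sketch below reproduces that encoding and the stepping of the two counters. The function names are hypothetical, and the wrap from word 26 back to word 0 is an assumption.

```python
def word_counter_flip_flops(state):
    """Return FCW05..FCW00 as a bit string for word counter state 0-26."""
    if not 0 <= state <= 26:
        raise ValueError("word counter state must be 0 through 26")
    digits = (state // 9, (state // 3) % 3, state % 3)   # base-three digits
    return "".join(format(d, "02b") for d in digits)     # two bits per digit


def step(bit_count, word_count):
    """Advance one master clock pulse; the word counter steps at bit 31."""
    if bit_count == 31:
        return 0, (word_count + 1) % 27
    return bit_count + 1, word_count


if __name__ == "__main__":
    for s in (0, 5, 9, 17, 26):
        print(s, word_counter_flip_flops(s))
```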




3.  Input  Counter  (FCN00-FCN02)  -  The  input  counter  is  used 
to  control  the  timing  of  all  input  operations,  the  resetting  of 
the  instruction  register  at  the  end  of  the  compute  cycle,  and 
the  resetting  of  the  output  counter  at  the  beginning  of  an  out¬ 
put  cycle.  The  count  sequences  are  shown  in  the  timing 
diagrams. 

4. Instruction Register (FCC00-FCC02) - The instruction
register (Figure 2-16) controls what the computer is doing.
When it is in state zero, ready for instruction, then an instruction
will be set into this register from the input tape.
This will cause the computer to go through a cycle of one
or two instructions and then return to the "Ready for Instruction"
state.


State  Instruction             FCC02  FCC01  FCC00
  0    Ready for Instruction     0      0      0
  1    Fill Core                 0      0      1
  2    Start Output Tape         0      1      0
  3    Stop Tapes                0      1      1
  4    Fill Drum                 1      0      0
  5    Prepare to Fill Drum      1      0      1
  6    Set Address               1      1      0
  7    Compute                   1      1      1

Figure 2-16. Instruction Register


5. Ferranti Flip Flop (FCC03) - This flip flop is turned on with the
master clock and turned off with the half clock. It is used to
generate a Ferranti write pattern for the write amplifiers.


6.  Start  Control  -  What  is  needed  to  start  the  computer  properly 
is  a  signal  that  is  true  for  one  bit  time  only.  When  the  start 
button  is  depressed  FCC04  is  turned  on.  Then,  when  the  bit 
counter  is  in  state  31,  the  start  signal  (NCC20  and  ICC20)  is 
true.  This  start  signal  starts  the  input  tape  and  resets  the  in¬ 
struction  register,  the  tape-to-core  buffer  register  and  the 
parity  check  flip  flops.  The  computer  then  idles  until  it  re¬ 
ceives  an  instruction  from  the  input  tape. 

Before the computer is started the clear core button must be
depressed. This will load the core with zeros and set the address
register to word zero. To prevent tapes from starting
prematurely the following sequence of operations must be performed
in the correct order when turning on the computer
power.

(a)  Press  the  reset  button  on  both  tape  units. 

(b)  Turn  on  the  power. 

(c)  Press  the  halt  buttons. 

(d)  Press  the  start  button  on  both  tape  units. 

7.  Tape  Motion  Control 

a. Input Tape - The motion of the input tape is controlled
by  FCN05.  When  it  is  turned  on  the  tape  will  start  and  when 
it  is  turned  off  the  tape  will  stop.  This  tape  is  always  started 
with  the  start  button.  The  following  conditions  will  stop  the  tape: 

(1)  The  computer  accepts  a  "stop  tapes"  instruction  from 
the  tape. 

(2)  The  halt  buttons  are  depressed. 


(3) A tape parity error is detected and the parity control
switch is set to halt.

(4) The tape has backed up to reread a record.

(5) The computer is in the single cycle mode and the core
is filled.

b. Output Tape - The motion of the output tape is controlled by
FCT00. When it is turned on the tape will start and when it
is turned off the tape will stop. The tape will start when the
computer accepts a "start output tape" instruction from the
input tape or when the start button is depressed and the tape
parity check flip flop (FCN03) is on. The following conditions
will stop the tape:

(1)  The  computer  accepts  a  "stop  tapes"  instruction  from  the 
input  tape. 

(2)  The  halt  buttons  are  depressed. 

(3) A tape parity error is detected and the parity control
switch is set to halt.

(4)  The  computer  is  in  the  single  cycle  mode  and  the  core 
is  filled. 

At the beginning of a problem, the output tape is started under
control of the input tape to allow a 3.5 inch file gap to be placed
on the output tape. To do this the tape parity check flip flop
must be off when the start button is depressed. If the tape
parity check light is on, then the reset error button must be
depressed before the computer is started. A tape parity error
will stop both tapes. This will prevent running out a lot of output
tape while the input tape is being backed up. When the
computer is again started, both tapes must start simultaneously
so the parity check circuits must not be reset then.
The start button will reset the tape parity check flip flop.

8. Error Control

a. Tape Error - During the "Ready For Instruction" and "Fill
Core"  modes  of  the  computer  operation,  information  is  trans¬ 
ferred  from  the  input  tape  to  the  instruction  register  and  the 
core  buffer  unit.  All  circuits  not  directly  involved  in  this 
transfer  of  information  are  idling,  thus  if  a  parity  error  is 
detected  at  this  time,  the  record  in  which  the  error  occurred 
can  be  reread.  The  parity  is  checked  at  the  tape-to-core 
buffer  register  and  if  it  is  incorrect  FCN03  will  be  turned  on 
and  the  tape  parity  light  will  be  turned  on.  If  the  parity  con¬ 
trol  switch  is  set  to  halt,  the  tapes  will  stop.  The  record  can 
then  be  reread  as  follows: 

(1) Set the tape direction control switch to reverse and press
the start button. This will back up the input tape to the
beginning of the record.

(2) Press the clear core button. This will prepare the core
to be filled from word zero.

(3)  Set  the  tape  direction  control  to  forward  and  press  the 
start  button.  This  will  start  both  tapes  and  reset  the 
parity  check  flip  flop. 

The  reset  parity  button  must  not  be  depressed  as  this  will  pre¬ 
vent  the  output  tape  from  starting.  If  the  parity  is  reset  then 
it  can  be  turned  on  again  with  the  test  parity  button,  and  must 
be  done  before  the  computer  is  started. 

If  the  parity  control  switch  is  set  to  ignore  then  the  tape  parity 
light  will  turn  on  but  the  computer  will  not  stop. 




b. Core Error - During the "Fill Drum", "Set Address", and
"Compute"  instructions  parity  is  checked  at  the  core  buffer 
register.  The  register  is  set  with  PCX06  and  immediately 
starts  shifting,  so  the  parity  must  be  checked  the  bit  time 
after  it  is  set. 

PCX06  turns  on  FCN06  which  turns  itself  off  the  following  bit 
time.  The  core  parity  flip  flop  (FCN04)  will  be  turned  on  if 
FCC02  (correct  instruction),  and  FCN06  are  both  on  and 
there is an even parity. FCN04 turns on the core parity
light,  but  the  computer  continues  to  run.  The  light  can  be 
turned  off  with  the  reset  parity  button. 

9. Output Write Control - Three flip flops control the write cycle to
the  output  tape.  FCT01  inhibits  the  tape  write  pulse  when  it  is  on. 
FCT02  and  FCT03  control  the  lengths  of  the  record  gap,  output 
data  and  output  excess  cycles.  This  is  shown  in  Figure  2-17. 
Information  is  transferred  from  the  output  register,  FTT00-FTT05 
and  DTP02,  through  their  line  drivers  onto  the  output  tape  when¬ 
ever  there  is  a  tape  write  pulse  (LCX36).  This  is  an  eight  micro¬ 
second  pulse  during  bits  4-5  and  20-21  when  the  IBM  727  tape  unit 
is  used.  The  IBM  729  tape  unit  requires  four  microsecond  pulses 
and  they  will  occur  at  bit  time  four  and  bit  time  twenty.  The  write 
check  character  pulse  tells  the  tape  unit  to  record  the  longitudinal 
parity  character.  For  the  IBM  727  tape  unit  this  is  a  16  micro¬ 
second  pulse  (LCX37),  and  for  the  IBM  729  tape  unit  the  signal  is 
the  turn  off  of  a  flip  flop  (FCT04)  which  was  turned  on  by  the  first 
character  of  the  record.  The  check  character  is  recorded  four 
character  spaces  after  the  last  character  of  the  record. 


Figure 2-17. Output Cycle


H.  COMMANDS 


1.  Ready  for  Instruction  -  The  computer  accepts  the  next  character 
on  the  input  tape  as  an  instruction,  providing  the  character  has  a 
zero in channel eight. The rest of the computer idles. See Figure
2-18.

2.  Fill  Core  -  The  information  on  the  input  tape  is  routed  through  the 
tape-to-core  buffer  register  into  the  core  buffer  unit.  The  core 
will  accept  193  characters  and  then  signal  the  computer  that  it  is 
filled.  The  instruction  register  will  then  return  to  a  ready  for 
instruction configuration. See Figure 2-19.

3. Start Output Tape - When the instruction register is set to "Start
Output Tape," the output tape motion control flip flop (FCT00) is
turned on and the output tape write control flip flop (FCT01) is
turned on, starting the tape and inhibiting the write pulse. The
output counter starts its cycle when the tape mark is recognized
on the input tape. The computer will accept instructions from the
input tape when the instruction register is in this configuration.

4.  Stop  Tapes  -  This  instruction  indicates  the  end  of  a  tape.  It  stops 
both  tape  units  and  turns  on  the  end  of  tape  light  on  the  control 
panel. 

5. Fill Drum - The instruction on the tape is "Prepare To Fill Drum."
This instruction synchronizes the input counter to the bit counter
and word counter for the fill drum operation. At bit 24 of word 26
the input counter is set to three. Then on bit 31 and with the input
counter again at three, the instruction register changes from
prepare to fill drum to fill drum. The drum lines now change from
individual recirculation to a serial connection as shown in Figure
2-20. A total of 144 characters are put on the drum from the core
buffer unit. This is the amount of information on a 27-word drum
line. Six fill drum instruction cycles are needed to completely fill
the drum. See Figure 2-21.

Figure 2-18. Ready for Instruction

Figure 2-19. Fill Core

6.  Set  Address  -  The  set  address  instruction  transfers  the  address 
of an iteration from the core buffer unit to the one-word output
line.  This  consists  of  four  characters  and  upon  completion  the 
instruction  register  changes  to  the  compute  configuration.  See 
Figure  2-22. 

7.  Compute  -  The  compute  instruction  is  started  automatically  at  the 
end  of  the  set  address  instruction.  During  this  cycle  of  27  word 
times  the  data  is  read  into  the  input  summing  registers  and  sent 
through  the  computer.  The  instruction  register  returns  to  the  ready 
for  instruction  configuration  at  the  end  of  the  cycle.  See  Figure 
2-23. 
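The instruction codes of Figure 2-16 and the automatic sequencing described in the commands above can be summarized as a small table-driven sketch. The names are hypothetical. The return of Fill Drum to the ready state is inferred from the general statement that every cycle of one or two instructions ends in "Ready for Instruction", and Start Output Tape and Stop Tapes are left without successors because the text does not describe an automatic transition for them.

```python
from enum import IntEnum


class Instruction(IntEnum):
    """Instruction register states (FCC02 FCC01 FCC00 read as binary)."""
    READY_FOR_INSTRUCTION = 0
    FILL_CORE = 1
    START_OUTPUT_TAPE = 2
    STOP_TAPES = 3
    FILL_DRUM = 4
    PREPARE_TO_FILL_DRUM = 5
    SET_ADDRESS = 6
    COMPUTE = 7


# State the register moves to when the current instruction completes.
NEXT_STATE = {
    Instruction.FILL_CORE: Instruction.READY_FOR_INSTRUCTION,
    Instruction.PREPARE_TO_FILL_DRUM: Instruction.FILL_DRUM,
    Instruction.FILL_DRUM: Instruction.READY_FOR_INSTRUCTION,  # inferred
    Instruction.SET_ADDRESS: Instruction.COMPUTE,
    Instruction.COMPUTE: Instruction.READY_FOR_INSTRUCTION,
}


def run_cycle(instruction):
    """List the register states visited after `instruction` is accepted."""
    sequence = [instruction]
    while sequence[-1] in NEXT_STATE:
        sequence.append(NEXT_STATE[sequence[-1]])
    return sequence


if __name__ == "__main__":
    print([i.name for i in run_cycle(Instruction.SET_ADDRESS)])
```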

I. TAPE FORMAT

1. Input Tape - The tape must have a 3.5 inch file gap before the
beginning of any recorded information. At the end of the file gap is
a tape mark and its associated check character.

During word times 0-5 of a compute cycle, the multiplicand register
holds W1. This is entered into the register during the last word
of the previous compute cycle, so the register must be preset
before the first cycle. This is accomplished by filling the input
accumulator drum lines and going through a dummy compute cycle.
The correct value will be entered into the multiplicand register if
-W1 is filled on the drum to read from FAR02 at word time 25, and
the data during the dummy compute cycle is all zeros, since
9S(t + T) - S(t) then reduces to W1. This is
shown in Figure 2-24.


After the multiplicand register is preset, the drum must be filled
with the initial conditions. This requires six records, each
record having enough information to fill a twenty-seven word line.

The record consists of a fill core character, followed by 144
data characters, 49 excess characters which are needed to
completely fill the core buffer unit but are not entered into the
computer, three excess characters which cannot look like instructions,
and finally a fill drum character. The last three excess characters
are necessary to fill the record out to 198 characters, a multiple of
six characters, which is an IBM word. The drum line interconnections
during the fill drum mode are shown in Figure 2-20. The layout of a
fill drum record is shown in Figure 2-25.

After the drum is filled, the output tape must be started. A
six-character record is used to do this and it is shown in Figure 2-26.
Following this short record is a file gap instead of the normal
record gap. This is used to provide the output tape with a file gap.

The computer is now ready to accept input data and compute. One
record of 198 characters as shown in Figure 2-28 is used for each
iteration. The first character is a fill core instruction. This is
followed by 193 characters of data, ordered as shown in Figure 2-27.
After the data the set address instruction is used to start the
compute cycle. Three excess characters are used between the last
data character and the set address character so that the record
will consist of an integral number of IBM words. The first
character of the last record is a stop tape instruction. This will stop
both tapes and turn on the end of tape light.
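The character counts quoted above for the two input record types can be checked with a short sketch. The field names and the helper functions are illustrative assumptions; only the lengths (1, 144, 49, 3, 1 and 1, 193, 3, 1) and the six-character IBM word come from the text.

```python
IBM_WORD_CHARS = 6  # six tape characters make one IBM word

# (field name, length in characters), listed in tape order.
# Field names are illustrative; lengths are those quoted in the text.
FILL_DRUM_RECORD = [
    ("fill core instruction", 1),
    ("drum data", 144),
    ("excess to fill core buffer", 49),
    ("excess, not instruction-like", 3),
    ("fill drum instruction", 1),
]
ITERATION_RECORD = [
    ("fill core instruction", 1),
    ("iteration data", 193),
    ("excess padding", 3),
    ("set address instruction", 1),
]


def record_length(layout):
    """Total number of characters in a record layout."""
    return sum(length for _, length in layout)


def ibm_words(layout):
    """Number of whole IBM words; raises if the record is not word-aligned."""
    words, remainder = divmod(record_length(layout), IBM_WORD_CHARS)
    if remainder:
        raise ValueError("record is not an integral number of IBM words")
    return words


def field_positions(layout):
    """Return (name, first, last) tuples with 1-based character positions."""
    positions, start = [], 1
    for name, length in layout:
        positions.append((name, start, start + length - 1))
        start += length
    return positions


if __name__ == "__main__":
    for label, layout in (("fill drum", FILL_DRUM_RECORD),
                          ("iteration", ITERATION_RECORD)):
        print(label, record_length(layout), "characters,",
              ibm_words(layout), "IBM words")
```

Both layouts come to 198 characters, which is why three excess characters are needed in each case to reach a whole number of IBM words.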


2. Output Tape - The output tape is started with an instruction on
the input tape, and then nothing is written on the tape until the
tape mark on the input tape starts the write cycle. No tape mark
is recorded on the output tape.

The record consists of excess information followed by the output
data. The excess information is zeros in channels 1, 2, 4, 8,
A and B, and a one in channel C. The last three characters of the
excess information will have a one in channel B and zeros in the
other channels. The output data is a total of 108 characters in
length, but only 76 characters contain output information. The
output record is shown in Figure 2-28 and Figure 2-29.

J. LABORATORY MODELS OF HSDDA - Photographs of the laboratory
models of the HSDDA Computer and the manual control unit for the computer
are reproduced as Figures 2-30 and 2-31.


Figure 2-20. Serial Drum Connections for Drum Fill.

Figure 2-22. Set Address

198 characters fill the core (198 characters = 33 IBM words).
Repeating this cycle 6 times will completely fill the drum.


          Record       Record       Record       Record       Record       Record
Word*     No. 1        No. 2        No. 3        No. 4        No. 5        No. 6

 0        X            U13(n-1)     0            W2U21(n-2)   W2U21(n-1)   X
 1        X            U23(n-1)     0            W2U31(n-2)   W2U31(n-1)   X
 2        X            U33(n-1)     0            W2U13(n-2)   W2U13(n-1)   X
 3        X            X            0            W2U23(n-2)   W2U23(n-1)   W3
 4        X            X            0            W2U33(n-2)   W2U33(n-1)   X
 5        X            X            0            W3U12(n-2)   W3U12(n-1)   X
 6        X            U11(n-1)     0            W3U22(n-2)   W3U22(n-1)   X
 7        X            U21(n-1)     0            W3U32(n-2)   W3U32(n-1)   X
 8        X            U31(n-1)     0            W3U11(n-2)   W3U11(n-1)   X
 9        X            U13(n-1)     0            W3U21(n-2)   W3U21(n-1)   X
10        X            U23(n-1)     0            W3U31(n-2)   W3U31(n-1)   A3
11        X            U33(n-1)     0            0            0            A2
12        X            U12(n-1)     0            0            0            X
13        X            U22(n-1)     0            0            0            X
14        X            U32(n-1)     0            0            0            X
15        X            U11(n-1)     0            0            0            A1
16        X            U21(n-1)     0            0            0            X
17        X            U31(n-1)     0            0            0            X
18        X            U13(n-1)     0            0            0            X
19        X            U23(n-1)     0            0            0            W1
20        U33(n-1)     X            W1U13(n-2)   W1U13(n-1)   X            X
21        U12(n-1)     ΔU12(n-1)    W1U23(n-2)   W1U23(n-1)   X            1A1
22        U22(n-1)     ΔU22(n-1)    W1U33(n-2)   W1U33(n-1)   X            3A3
23        U32(n-1)     ΔU32(n-1)    W1U12(n-2)   W1U12(n-1)   W2           3A2
24        U11(n-1)     0            W1U22(n-2)   W1U22(n-1)   X            5W3
25        U21(n-1)     0            W1U32(n-2)   W1U32(n-1)   X            X
26        U31(n-1)     0            W2U11(n-2)   W2U11(n-1)   X            8W2

* 32 bits each
X - don't care
(n) = result of nth iteration

Figure 2-25. Drum Fill Information


Record = 198 Characters = 33 IBM Words

                        Channels    Channels
Word     Characters     1, 2, 4     8, A, B

 0       1-7            W3          A1
 1       8-14           W1*         A3
 2       15-21          W2          A2
 3       22-28          W3          A1
 4       29-35          W1          A3
 5       36-42          W2*         A2
 6       43-49          W3          A1
 7       50-56          W1          A3
 8       57-63          W2          A2
 9       64-70          W3          A1
10       71-77          W1          A3
11       78-84          W2          A2
12       85-91          W3*         A1
13       92-98          W1          A3
14       99-105         W2          A2
15       106-112        W3          A1
16       113-119        W1          A3
17       120-126        W2          A2
18       127-133        W3          A1
19       134-140        W1          A3*
20       141-147        W2          A2*
21       148-154        W3          A1
22       155-161        W1          A3
23       162-168        W2          A2
24       169-175        W3          A1*
25       176-182        W1          A3
26       183-189        W2          A2

* These values correspond in time.

Figure 2-27. Input Data Organization
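The regular structure of Figure 2-27 (seven characters per word, with the channel contents repeating every three words) can be expressed as a small lookup. This is only a sketch of the tabulated pattern; the function and constant names are assumptions, and the asterisked timing relationships in the figure are not modelled.

```python
CHARS_PER_WORD = 7
WORDS_PER_RECORD = 27

# Contents repeat every three words, as tabulated in Figure 2-27.
CHANNELS_124_CYCLE = ("W3", "W1", "W2")   # channels 1, 2, 4
CHANNELS_8AB_CYCLE = ("A1", "A3", "A2")   # channels 8, A, B


def word_layout(word):
    """Return (first_char, last_char, channels_124, channels_8AB) for a word.

    Character positions are 1-based and inclusive, matching the figure.
    """
    if not 0 <= word < WORDS_PER_RECORD:
        raise ValueError("word must be in the range 0..26")
    first = CHARS_PER_WORD * word + 1
    last = first + CHARS_PER_WORD - 1
    return (first, last,
            CHANNELS_124_CYCLE[word % 3],
            CHANNELS_8AB_CYCLE[word % 3])


if __name__ == "__main__":
    for w in range(WORDS_PER_RECORD):
        print(w, *word_layout(w))
```

The 27 words occupy characters 1 through 189 of the data portion of the record.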


Figure 2-30. Laboratory Model of the HSDDA Computer

Figure 2-31. Manual Control Unit for the HSDDA Computer


2.10 STRAP-DOWN PROCESSOR CHECKOUT AND PERFORMANCE
EVALUATION

A. General Features of Checkout and Performance Evaluation
Programs and Tapes - Operation of the breadboard HSDDA
was verified to be in accordance with the logical design
described. Test tape programs and preparation for evaluation of
the HSDDA in the strap-down function were not fully debugged within
the program effort period. It is recommended that a brief further
effort be provided to evaluate HSDDA performance. For purposes
of clarity, Figure 2-32 presents the total flow of information
processing required for final evaluation and demonstration of the
high speed DDA hardware and, in block diagram form, indicates
the requirements for the five 704 programs which are described
below.


Figure 2-32. Information Processing for Evaluation of High
Speed DDA Hardware. (Block diagram; legible block labels: Raw Data,
Error Analysis Report.)


B. IBM 704 Programs

1. The A Program was planned to process raw data provided
by WADD into a form compatible with the instrumentation
assumptions made in establishing the input characteristics
of the high speed DDA. This program accounts for data
sample rates, scale factors, resolutions, phase shifts,
etc., which may be inherent in the raw data and which
must be compensated for before the data can be
assimilated by the HSDDA. This program was never written, since
no raw data was provided to Litton by WADD.

2.  The  B  Program  accepts  data  in  proper  analytic  form,  and 
provides  required  phase  shifts,  formatting,  and  control 
signal  insertion.  The  output  of  this  program  provides 
tape  which  is  directly  acceptable  by  the  HSDDA. 

3.  The  C  Program  accepts  a  tape  produced  by  the  B  Program  and 
provides  a  higher  accuracy  integration  process  than  that  used  in 
the  HSDDA  to  produce  a  set  of  "yardstick"  computations  used 
for  performance  evaluation  of  the  HSDDA.  Care  is  taken  that 
the  phase  shifts  introduced  in  the  B  Program  for  HSDDA  input 
are  properly  compensated  for  in  the  C  Program. 

4.  The  D  Program  accepts  the  output  tape  from  the  HSDDA  and 
the  "yardstick"  computations  derived  as  an  output  tape  from 
the  C  Program,  and  provides  an  error  analysis  of  HSDDA 
performance. 

5. The E Program, the Dutch Roll Generating Program, is described
under C. below.

C. Tests Planned For Strap-down Processor Evaluation


1. Functional Tests - The functional tests were designed to
prove that the hardware was operating in the manner that
the logical designer had intended it to operate. Magnetic tapes
for these tests required the B Program.

2. Mechanization Tests - The mechanization tests were
designed to prove that the arithmetic operations built into
the HSDDA had indeed provided a valid mechanization of
the integration algorithms which the machine was supposed to
contain. These tests required the existence of the B Program
and also required hand-computed check values which were
computed on the basis of the fundamental algorithms of the
machine, but without utilizing the machine mechanization.

3. Algorithm Validity Tests - These tests were designed to
prove the validity of the algorithms which had been mechanized
in the HSDDA. They were based on the generation of an input
tape which simulates a Dutch roll environment. The Dutch
roll is chosen because the closed analytic solution is known,
so that arbitrary check point computations are readily
obtainable. The first phase of these tests showed that it is
possible to demonstrate HSDDA performance through the use
of check points hand-computed from the analytic solution.
Through the proper choice of amplitude, frequency, and phase
of the Dutch roll components it should be possible to make
direct visual comparison between two iterations, separated
by a specific number of iterations, without having to go to any
hand calculations. This effort requires Program E, the
Dutch Roll Generating Program. Programs C and D were
completed so that the Dutch Roll Program could be played
through the high accuracy integration program and error
analysis performed. Then there would be comparative
results of the HSDDA, the "yardstick" calculations, and
the theoretical analytic solution.

4. Real Data Tests - These tests were designed to demonstrate
machine performance when operating on real data which was to
be provided by WADD. These tests were not executed for the
reasons described above.


2.11 FUNCTIONAL AND MECHANIZATION TESTS OF THE STRAP-DOWN
PROCESSOR - The earliest stage of testing of the computer sought to verify
the final mechanization operation with regard to the logical designer's intent of basic
logical operation and the system analyst's intent of arithmetic processing by the
computer. The first of these tests were of an elementary nature, consisting of
testing the response to streams of alternating 1's and 0's for selected inputs and
debugging on this level. In order to confirm the arithmetic processing consistency
of the machine for a desired function with a high level of confidence, a set of
relatively complicated inputs was selected for which the exact desired outputs could
be hand computed by the system analyst over several iterations. To verify the
second order algorithm, or any component, requires at least 3 iterations on an
input variable of at least the complexity of a quadratic function. To verify
communication, the quadratic inputs should be distinct for each input variable. The
following quadratic family was constructed on this basis:

IM  -  Ml*  •  [l ♦  (.1>*W2-|7— ^ 

where  •  >  1,  2,  3  for  coordinates  1,  2,  3 

u  «  0,  1  for  acceleration  or  angular  rate  input 
w  •  word  time  of  input 

Since inputs are sampled at a high rate, and preprocessed by summation
of 9 values, the hand computations are simplified by an analytic evaluation of
the preprocessing output which is

where  v*-  *  2*w+^1 

e*-  -  27a  -  15  ♦  a#,“ 
a 




where  a“‘ W  is  the  word  time  at  which  the  input  is  abeorbed  in  the  computer, 
(deducible from the pertinent table in the logical design description). The input
accumulator outputs are then quite simply computed, rounded off, and multiplied
by the proper U value of the set specified for tape preparation in the test. The U
values used were arbitrary, non-trivial, binary numbers. The products obtained
by an octal desk computer were rounded off and put through the desired
extrapolator computation to yield the integral increment for each variable. Debugging
of the computer proceeded on the first iteration. When all desired results were
obtained exactly, the next iterations were hand computed, to verify the exact
computation of results by the strap-down processor for the above inputs.


u  Jt“  «  u  if*  ♦  Au 


„  where  j,  t  *  1,  2,  3;  w  *  1,  0 


Au 


jtw  m 


^jrsw 

a 

a,  w 


«r] 


JL 

.  R 


*  precision  round-off  to  19  bits  k  sq 


»  9A  J*,W-  AJ*\W 


*  2  where  w  «  27(n-l)  +  3k  +  o-' w 

▼1  k»0 
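The simplification described above, replacing a sum of nine samples of a quadratic input by an analytic expression, can be illustrated as follows. This is a sketch under stated assumptions: the quadratic coefficients `a`, `b`, `c` are hypothetical, and the sample spacing of three word times with index k = 0..8 is read from the expression w = 27(n-1) + 3k + offset above. Rounding and the extrapolation performed by the input accumulator are not modelled.

```python
def quadratic(a, b, c, w):
    """Hypothetical quadratic test input evaluated at word time w."""
    return a + b * w + c * w * w


def sum_of_nine_direct(a, b, c, w0, spacing=3):
    """Sum the nine samples taken at w0, w0 + spacing, ..., w0 + 8*spacing."""
    return sum(quadratic(a, b, c, w0 + spacing * k) for k in range(9))


def sum_of_nine_closed_form(a, b, c, w0, spacing=3):
    """Same sum evaluated analytically, as a hand computation would do."""
    s1 = 36    # 0 + 1 + ... + 8
    s2 = 204   # 0^2 + 1^2 + ... + 8^2
    return (9 * a
            + b * (9 * w0 + spacing * s1)
            + c * (9 * w0 * w0 + 2 * w0 * spacing * s1
                   + spacing * spacing * s2))


if __name__ == "__main__":
    w0 = 27 * (3 - 1) + 4   # third iteration, hypothetical offset of 4
    print(sum_of_nine_direct(2, -3, 5, w0),
          sum_of_nine_closed_form(2, -3, 5, w0))
```

With integer coefficients the two evaluations agree exactly, which is the property that makes exact hand-computed check values practical.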


A. Algorithm Validity Tests.

1. Introduction - The accuracy of the strap-down processor for
a given set of input functions is determined, in practice, by
comparison of processor outputs with other computed values
which insofar as possible are nearly exact values or at least
of significantly higher accuracy than can be expected of the
processor. Two approaches enable computation of "exact"
or "yardstick" solutions for certain output types and have
valuable coordinated uses. In the case of simple Dutch roll
motion generated by inputs which are expressible as analytic
functions of the type presented in Chapter 6 Section 4 of the
Phase I report, the exact inertial reference is expressed as an explicit function
so that evaluation of the processor can be as exact as desired at any point, by
hand or machine computation, without using incremental computation techniques
subject to error build-up. This approach is the most incontrovertible for the
limited class of inputs for which it is feasible. The second approach is the
general solution by incremental computation of very high accuracy, which in
principle can be attained because numerical integration with any nth order
algorithm (n > 0) has accuracy which generally increases with decreased
iteration interval. The second approach, however, is subject to accuracy
limitations because the actual computation is not performed with whole numbers
of unlimited accuracy as required by theoretical considerations, but with a
whole number of limited bit length. There are other practical problems such as
increase in costs resulting from the decreased iteration interval. The next
section of this report presents the ECD programs which enable evaluation of
the processor for Dutch roll and provide a general "yardstick" incremental
computation. The validity of the latter should be checked by comparison with the
exact analytic solution for Dutch roll. Provided the comparison is found
sufficiently close compared to processor output, it can be assumed that round-off
and algorithm error in the C Program is not a problem for low frequency
variables. The sensitivity to noise of the yardstick computation can probably be
evaluated analytically on the one hand and by yardstick program runs with
contracted or expanded iteration interval on the other. If it is found that two widely
different but small iteration intervals yield substantially the same results, the
really general accuracy of the proposed yardstick computation program is
assured with respect to random rather than bias errors. These latter analyses
were not performed during this program.
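The check proposed above, running the yardstick computation at two different small iteration intervals and comparing both with each other and with a known analytic solution, can be sketched on a toy problem. The example below is an assumption for illustration only: it integrates a planar rotation with a second-order midpoint rule in floating point, not the C Program's algorithm or word length.

```python
import math


def integrate_rotation(omega, t_end, steps):
    """Integrate x' = -omega*y, y' = omega*x from (1, 0) with a midpoint rule."""
    h = t_end / steps
    x, y = 1.0, 0.0
    for _ in range(steps):
        # Half step to the midpoint, then a full step using midpoint slopes.
        x_mid = x - 0.5 * h * omega * y
        y_mid = y + 0.5 * h * omega * x
        x, y = x - h * omega * y_mid, y + h * omega * x_mid
    return x, y


def error_vs_exact(omega, t_end, steps):
    """Distance between the numerical result and (cos(wt), sin(wt))."""
    x, y = integrate_rotation(omega, t_end, steps)
    return math.hypot(x - math.cos(omega * t_end),
                      y - math.sin(omega * t_end))


if __name__ == "__main__":
    omega, t_end = 2.0 * math.pi, 1.0   # one cycle of a 1 cps signal
    for steps in (1000, 4000):
        print(steps, error_vs_exact(omega, t_end, steps))
```

For a second-order rule, quartering the interval should reduce the error by roughly a factor of sixteen; agreement between the two runs then gives some confidence in the finer one even where no analytic solution is available.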


2.12 THE DUTCH ROLL INPUT, GENERAL YARDSTICK COMPUTATION,
AND PROCESSOR-ERROR EVALUATION (ECD) PROGRAMS.

A. General Description - The following is a description and listing
of the IBM 704 ECD Program for the High Speed DDA. The
ECD Program simulates the HSDDA and compares the simulation
with the true output of the HSDDA.


The E section of the ECD Program generates six inputs.

Case I -
    \omega_1 = A \sin(\theta_{03}\omega t \cdot n_{\omega t} + B)
    \omega_2 = A \cos(\theta_{03}\omega t \cdot n_{\omega t} + B)          (II-38)
    \omega_3 = 0
    A_1 = 0,  A_2 = 0,  A_3 = 0

Case II -
    \omega_1 = 0
    \omega_2 = A \sin(\theta_{03}\omega t \cdot n_{\omega t} + B)          (II-39)
    \omega_3 = A \cos(\theta_{03}\omega t \cdot n_{\omega t} + B)
    A_1 = 0,  A_2 = 0,  A_3 = 0

Case III -
    \omega_1 = A \cos(\theta_{03}\omega t \cdot n_{\omega t} + B)
    \omega_2 = 0                                                           (II-40)
    \omega_3 = A \sin(\theta_{03}\omega t \cdot n_{\omega t} + B)
    A_1 = 0,  A_2 = 0,  A_3 = 0

The values for the constants and variables are:

    \theta_{03}\omega t = 0.0004882812694

    A = -0.6666663342844

    B = -0.002502441124

    n_{\omega t} = a positive integer starting from zero and
    increasing by one (1) every iteration of the 704 program.

It will be noticed that a sine and cosine polynomial has to be generated
every iteration. Since thousands of iterations will be performed,
the above equations are evaluated only once every 50 iterations.

During the intervening iterations, the following equations are used
for the sine and cosine:

    S_n = S_{n-1} \cos\theta_{03}\omega t + C_{n-1} \sin\theta_{03}\omega t          (II-41)

    C_n = C_{n-1} \cos\theta_{03}\omega t - S_{n-1} \sin\theta_{03}\omega t          (II-42)

where S_n and C_n are the sine and cosine for the present iteration,
S_{n-1} and C_{n-1} are the sine and cosine for the previous iteration,

    \cos\theta_{03}\omega t = 0.9999998809
    \sin\theta_{03}\omega t = 2^{-11}



Since there is a possibility that the argument for the sine and
cosine can become increasingly large and cause overflow due to
the variable n_{\omega t}, the argument is tested against \pi/2. When it
exceeds \pi/2, the excess becomes the new argument. Consequently,
there will also be a change in the signs of the sine and cosine.
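One way to realize the argument test described above is a quadrant reduction: each time \pi/2 is removed from the argument, the sine and cosine exchange roles with a sign change. The sketch below is an assumed formulation for non-negative arguments, not a transcription of the 704 routine.

```python
import math

HALF_PI = math.pi / 2.0


def reduce_argument(arg):
    """Remove multiples of pi/2; return (reduced argument, quadrant 0..3)."""
    if arg < 0:
        raise ValueError("sketch handles non-negative arguments only")
    quadrant = 0
    while arg >= HALF_PI:
        arg -= HALF_PI          # the excess becomes the new argument
        quadrant += 1
    return arg, quadrant % 4


def sin_cos_reduced(arg):
    """Sine and cosine evaluated from the reduced argument."""
    reduced, quadrant = reduce_argument(arg)
    s, c = math.sin(reduced), math.cos(reduced)
    for _ in range(quadrant):
        # sin(x + pi/2) = cos(x), cos(x + pi/2) = -sin(x)
        s, c = c, -s
    return s, c


if __name__ == "__main__":
    for value in (0.3, 2.0, 4.0, 7.5):
        print(value, sin_cos_reduced(value))
```

Keeping the reduced argument below \pi/2 means the evaluated quantity stays bounded regardless of how large the iteration count becomes.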

The  C  section  of  the  ECD  Program  receives  its  inputs  from  the  E 
Section  and  evaluates  the  following  set  of  equations  which  are  also 
solved  by  the  High  Speed  DDA. 


\Delta V^*_j = a_1 U_{j,1} + a_2 U_{j,2} + a_3 U_{j,3},     j = 1..3          (II-43)

dVj  * 

Q  AV*  AV* 

11  >n  12  J.-1  12  J.-2 

j  » 

1..3 

(II-44) 

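\Delta U^*_{j,1} = \omega_3 U_{j,2} - \omega_2 U_{j,3},     j = 1..3          (II-45)

\Delta U^*_{j,2} = \omega_1 U_{j,3} - \omega_3 U_{j,1},     j = 1..3          (II-46)

\Delta U^*_{j,3} = \omega_2 U_{j,1} - \omega_1 U_{j,2},     j = 1..3          (II-47)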

40  M 

j  * 

1..3 

(11-48) 

V  ■ 

J  » 

A  •  # 

(11-49) 

°j.»  ■ 

~*U*  -  — du*  ♦  —  su* 

12*Uj.J  12  “U  J.3n.,  12  J.l„j 

i  ’ 

*  #  •  *r 

(11-50) 

*Vi 

~  UJ.lt4UJ.. 

'  UJ.2t4UJ.2  >  ■ 

(11-51) 

'Vj 

- 

•Ujy  «  new  value  of 




B. Program Execution - The U_{j,k} array is initialized before any
computations are made. Then control numbers for a tape input
mode or case selection (1, 2, 3) for input of the \omega's are zero. The
case selection refers to Cases I, II, III for selection of the order
of sines and cosines described previously. The
calculations are performed in double precision floating point except
for the sines and cosines generated by the Dutch roll routine. The
fixed point values of the Dutch roll \omega's are floated and used
in the double precision calculations. The output quantities are fixed
and sent to an output area where they are printed in the D Program.

The  D  section  of  the  ECD  Program  prints  and/or  compares  the 
output  from  the  High  Speed  DDA  and  the  output  from  the  C  section 
of  the  program.  The  D  section  of  the  program  for  the  High  Speed 
DDA  has  three  modes  of  operation: 

1.  Comparison  of  the  704  tape  (output  of  the  C  Program)  and  the 
High  Speed  DDA  tape  (output  of  the  High  Speed  DDA)  and  listing 
of  the  corresponding  elements  of  each  matrix,  and  the  difference 
between  the  corresponding  elements  in  decimals. 

2.  Listing  of  the  elements  of  the  High  Speed  DDA  tape  in  both 
octal  and  decimal. 

3.  Listing  of  the  elements  of  the  704  tape  in  decimal. 

The interval between printing and the number of consecutive records
that are printed can be selected. A general flow chart of the D
section of the program is presented in Figure 2-33.

Figure 2-33. General Flow Chart


C. Control Cards - Placed After Transfer Card.

Card 1:

Column 1 - Mode
    1  - Compares DDA tape and DDA simulation
         program and prints difference
    2  - Simulates DDA and prints result
    3† - Prints DDA tape (on Unit no. 4)

Starts in Column 2 - Skip interval (in decimal); must be followed
by a comma

Next Column - Print interval (in decimal); must be followed
by a comma

Next Column - Maximum number of records (in decimal) to
process; next column must be blank

Example: Column - 1 2 3 4 5 6 7 8 9 10 11 12
         Punch  - 1 5 8 , 9 , 8 0 4

Mode = 1 (which compares DDA tape and DDA simulation and
prints the difference)

Skip Interval = 58 (after every 58 iterations, 9 consecutive
iterations will be printed)

Print Interval = 9 (9 consecutive iterations will be printed)

Maximum Number of Records = 804 (after processing 804 records,
or "iterations", the program will halt)

†When using Mode 3, no more control cards are necessary.
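The Card 1 layout above can be read with a few lines of string handling. The sketch below follows the column rules as stated (mode in column 1, comma-separated decimal fields, terminated by a blank column); the function name and the returned dictionary keys are assumptions for illustration.

```python
def parse_card1(card):
    """Parse control card 1 into mode, skip interval, print interval, maximum."""
    if not card or card[0] not in ("1", "2", "3"):
        raise ValueError("column 1 must hold mode 1, 2, or 3")
    # Fields start in column 2 and end at the first blank column.
    body = card[1:].split(" ", 1)[0]
    parts = body.split(",")
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        raise ValueError("expected skip,print,max-records in decimal")
    skip, print_count, max_records = (int(p) for p in parts)
    return {
        "mode": int(card[0]),
        "skip_interval": skip,
        "print_interval": print_count,
        "max_records": max_records,
    }


if __name__ == "__main__":
    # The example punch from the text: mode 1, skip 58, print 9, maximum 804.
    print(parse_card1("158,9,804 "))
```

A card that fails these checks corresponds to the "bad punch" condition for Card no. 1 listed under the programmed halts.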




Card 2:

Column 1 - Tape Number (A-Program input, not used yet)

Column 2 - Case Number for Dutch roll; a 1, 2, or 3
must be punched in this column

Column 3 - Must be a comma

Starts in Column 4 - Identification Number* (in decimal);
column following must be a comma

Next Column - Iteration Number** (in decimal); column
following must be blank

Example: Column - 1 2 3 4 5 6 7 8 9 10 11 12
         Punch  - 1 1 , 3 , 1 8 7 2 4

Tape Number = 1 (not used; can be anything)

Case Number = 1

Identification Number = 3

Iteration Number = 18724 decimal (DDA iteration number must be equal
to 44444 in octal)
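The Card 2 layout, together with the decimal-to-octal correspondence of the iteration number used in the example, can be sketched as follows. The function name and dictionary keys are assumptions; the column rules are those stated above.

```python
def parse_card2(card):
    """Parse control card 2: tape number, case, identification, iteration."""
    if len(card) < 3 or card[2] != ",":
        raise ValueError("column 3 must be a comma")
    if card[1] not in ("1", "2", "3"):
        raise ValueError("column 2 must hold case number 1, 2, or 3")
    # Identification and iteration numbers start in column 4 and end at
    # the first blank column.
    body = card[3:].split(" ", 1)[0]
    parts = body.split(",")
    if len(parts) != 2 or not all(p.isdigit() for p in parts):
        raise ValueError("expected identification,iteration in decimal")
    iteration = int(parts[1])
    return {
        "tape_number": card[0],            # not used; kept as punched
        "case": int(card[1]),
        "identification": int(parts[0]),
        "iteration": iteration,
        # Octal form, for comparison with the number on the DDA tape.
        "iteration_octal": format(iteration, "o"),
    }


if __name__ == "__main__":
    print(parse_card2("11,3,18724 "))
```

For the example punch, decimal 18724 converts to octal 44444, matching the value quoted for the DDA iteration number.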

* When mode number 1 is used (control card 1), the identification of the
DDA tape (1st record) and the identification punched in control card 2 must
be identical.

** When mode number 1 is used (control card 1), the iteration number on the
DDA tape (record 2) and the iteration number punched in control card 2 must
be identical.

Control Cards 3-35: The following 33 cards are for initialization
and each card must be included even if the value is zero.

Column 1 - Blank


Column 2 - If the sign of the decimal value is negative, then a
(-) sign must be placed in this column and a decimal
point (.) is placed in Column 3.
If the sign of the decimal value is positive, the decimal
point (.) is punched in this column.

Column 3 or 4 - The decimal value starts in this column. The
decimal exponent follows immediately after the
character E (which follows immediately after the last
digit of the decimal value). If the character E does
not appear, the exponent is assumed to be zero.

Example: Column - 1 2 3 4 5 6 7 8 9 10
         Punch  -   - . 0 3 4 6 2
         Punch  -   . 3 4 6 2 E - 1

(.3462E-1 is the same as .03462)
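The value format for cards 3-35 can be validated and converted with a short sketch. The regular expression and function name are assumptions; in particular, treating the first blank after the value as the end of the field is assumed here, since the text does not state how the field is terminated.

```python
import re

# Optional minus sign, decimal point, digits, optional E exponent.
_FIELD = re.compile(r"^-?\.\d+(E[+-]?\d+)?$")


def parse_initial_value(card):
    """Convert an initialization card (cards 3-35) to a floating value."""
    if not card.startswith(" "):
        raise ValueError("column 1 must be blank")
    # Assumption: the value ends at the first blank after column 1.
    field = card[1:].split(" ", 1)[0]
    if not _FIELD.match(field):
        raise ValueError("bad punch: " + repr(field))
    return float(field)


if __name__ == "__main__":
    # The two example punches from the text.
    print(parse_initial_value(" -.03462"))
    print(parse_initial_value(" .3462E-1"))
```

A missing E leaves the exponent at zero, as the card description requires, because the value is then an ordinary decimal fraction.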


Card  3 - ΔV*1  (n-1)
Card  4 - ΔV*2  (n-1)
Card  5 - ΔV*3  (n-1)
Card  6 - ΔV*1  (n-2)
Card  7 - ΔV*2  (n-2)
Card  8 - ΔV*3  (n-2)
Card  9 - ΔU*11 (n-1)
Card 10 - ΔU*12 (n-1)
Card 11 - ΔU*13 (n-1)
Card 12 - ΔU*21 (n-1)
Card 13 - ΔU*22 (n-1)
Card 14 - ΔU*23 (n-1)
Card 15 - ΔU*31 (n-1)
Card 16 - ΔU*32 (n-1)
Card 17 - ΔU*33 (n-1)
Card 18 - ΔU*11 (n-2)
Card 19 - ΔU*12 (n-2)
Card 20 - ΔU*13 (n-2)
Card 21 - ΔU*21 (n-2)
Card 22 - ΔU*22 (n-2)
Card 23 - ΔU*23 (n-2)
Card 24 - ΔU*31 (n-2)
Card 25 - ΔU*32 (n-2)
Card 26 - ΔU*33 (n-2)
Card 27 - U11
Card 28 - U12
Card 29 - U13
Card 30 - U21
Card 31 - U22
Card 32 - U23
Card 33 - U31
Card 34 - U32
Card 35 - U33
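The ordering of the 33 initialization cards follows a regular pattern, which can be generated rather than typed by hand when preparing a deck. The sketch below assumes the pattern inferred from the legible entries (cards 3-8 for the velocity increments, 9-26 for the direction-cosine increments, 27-35 for the direction cosines, each increment group ordered n-1 then n-2); the label strings are illustrative.

```python
def initialization_card_order():
    """Return {card number: quantity label} for control cards 3-35."""
    names = []
    # Cards 3-8: velocity increments for iterations n-1, then n-2.
    for lag in (1, 2):
        for j in (1, 2, 3):
            names.append(f"dV*{j}(n-{lag})")
    # Cards 9-26: direction-cosine increments for n-1, then n-2.
    for lag in (1, 2):
        for j in (1, 2, 3):
            for k in (1, 2, 3):
                names.append(f"dU*{j}{k}(n-{lag})")
    # Cards 27-35: the direction cosines themselves.
    for j in (1, 2, 3):
        for k in (1, 2, 3):
            names.append(f"U{j}{k}")
    return {card: name for card, name in enumerate(names, start=3)}


if __name__ == "__main__":
    for card, name in initialization_card_order().items():
        print(f"Card {card:2d} - {name}")
```

The three groups match the card ranges reported separately by the programmed halts (cards 3-8, 9-26, and 27-35).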


D. Operating Procedures. (Tapes)

Unit    Purpose

4       High Speed DDA tape
6       Off line output if sense switch 2 is OFF
        (all others not used)

Sense
Switches

1       ON
2       ON for on line printing
        OFF for off line printing to go on Tape Unit no. 6
3       OFF
4       OFF
5       OFF
6       OFF

The  SHARE  2  Printer  board  is  used.  The  Program  does  not 
rewind  any  tapes. 

E.  Programmed  Halts. 

33143 - Control cards missing (Card no. 1)

33144 - Control cards incorrect (Bad punch) (Card no. 1)

34032 - Control cards missing (Card no. 2)

34033 - Control cards incorrect (Bad punch) (Card no. 2)

34051 - Control cards missing (Cards 3-8)

34052 - Control cards incorrect (Bad punch) (Cards 3-8)

34062 - Control cards missing (Cards 9-26)

34063 - Control cards incorrect (Bad punch) (Cards 9-26)

34073 - Control cards missing (Cards 27-35)

34074 - Control cards incorrect (Bad punch) (Cards 27-35)

34564 - End of File Mark on HSDDA tape found while reading
data (Mode 1)

34605 - Iteration numbers do not match - tried 3 times

34636 - Iteration number = 0 (Mode 1)

34665 - Iteration number = 0 (Mode 2)

34706 - End of File Mark on HSDDA tape found while reading data

34742 - Iteration number = 0 (Mode 3); to restart, transfer to 34671

35044 - Maximum number of records have been processed as
specified in control card 1

35603 - Identification of HSDDA tape and identification number
punched on Control Card 2 are not identical

35602 - No identification number has been punched on Control
Card 2

2.13 INITIALIZATION OF THE STRAP-DOWN PROCESSOR - The \Delta U^*_{jk}
generated by the HSDDA are approximations to the central differences indicated in


The  extrapolation  formula 


4«Jk  ‘  -  Ti4Ujk("‘11  *  n4U,>-21 


is  used  to  approximate 


V'-»  •  ujk«,»-i)- 


(n-53) 


Initialization requires the computation of the \Delta U^*_{jk}'s or, equivalently, of
products (involving angular rates) which sum to them. An alternate approach,
requiring values of the U_{jk} only, is to compute suitable \Delta U^*_{jk}'s from the
formulas

\Delta U^*_{jk}(1) = U_{jk}(t_1) - U_{jk}(t_0)

\Delta U^*_{jk}(0) = \Delta U^*_{jk}(1) - [U_{jk}(t_1) - 2U_{jk}(t_0) + U_{jk}(t_{-1})]          (II-54)

\Delta U^*_{jk}(-1) = \Delta U^*_{jk}(0) - [U_{jk}(t_1) - 2U_{jk}(t_0) + U_{jk}(t_{-1})]

These formulas are exact if U_{jk}(t) is a quadratic. Furthermore, elimination
of the \Delta U^*'s between (II-54) and (II-52) does indeed yield (II-53) exactly.
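The statement that the initialization formulas are exact for a quadratic can be checked numerically. The sketch below reads (II-54) as a first difference extended backward with a constant second difference; that reading, the function name, and the sample quadratic are assumptions made for illustration.

```python
def initial_differences(u_t1, u_t0, u_tm1):
    """Return (dU*(1), dU*(0), dU*(-1)) from U at t1, t0 and t-1.

    Assumed reading of (II-54): each earlier difference equals the later
    one minus the second difference U(t1) - 2 U(t0) + U(t-1).
    """
    second_difference = u_t1 - 2 * u_t0 + u_tm1
    d1 = u_t1 - u_t0
    d0 = d1 - second_difference
    dm1 = d0 - second_difference
    return d1, d0, dm1


def sample_quadratic(t):
    """Hypothetical quadratic used only to exercise the formulas."""
    return 3 + 2 * t + 5 * t * t


if __name__ == "__main__":
    d1, d0, dm1 = initial_differences(sample_quadratic(1),
                                      sample_quadratic(0),
                                      sample_quadratic(-1))
    # For a quadratic these equal the true backward differences.
    print(d1, sample_quadratic(1) - sample_quadratic(0))
    print(d0, sample_quadratic(0) - sample_quadratic(-1))
    print(dm1, sample_quadratic(-1) - sample_quadratic(-2))
```

Under this reading only three values of U are needed, yet the difference one step earlier than the supplied data is still reproduced, because a quadratic has a constant second difference.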


During the first iteration the computer computes \Delta U^*_{jk}(1) from initially given
U_{jk}'s and \omega's. It is possible, however, to determine \omega's from the U_{jk}(t_0)'s
which will generate the desired values as given by (II-54). The necessary
equations are

U^2 \omega_1 = U_{j3}(t_0) \Delta U^*_{j2}(1) - U_{j2}(t_0) \Delta U^*_{j3}(1)

U^2 \omega_2 = U_{j1}(t_0) \Delta U^*_{j3}(1) - U_{j3}(t_0) \Delta U^*_{j1}(1)          (II-55)

U^2 \omega_3 = U_{j2}(t_0) \Delta U^*_{j1}(1) - U_{j1}(t_0) \Delta U^*_{j2}(1)

where U^2 = U_{j1}^2 + U_{j2}^2 + U_{j3}^2.

If computations (in the HSDDA) are to be performed for more than one inertial
axis, (II-55) is not sufficient, since it determines a set of \omega's for which the
rotation about the j-th inertial axis is zero. What is necessary for an orthogonal
inertial system is to calculate an \omega-vector from (II-55) for each j. The
appropriate \omega-vector to be used for the first iteration is then given by half the sum of
the three vectors so calculated. The multiplicative factor 8/9 must also be
included in the scaling constants to defeat the extrapolation in the input
accumulator unit.
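The half-sum rule can be checked numerically. In the sketch below each row of an orthonormal direction-cosine matrix yields, through (II-55), an angular-rate vector with no component along that row, and half the sum of the three recovers the full vector. The row increments are formed as in (II-45) to (II-47); the test matrix, the function names, and the omission of the 8/9 scaling factor are assumptions of this illustration.

```python
import math


def cross(a, b):
    """Vector cross product of two 3-tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def orthonormal_rows(yaw, roll):
    """Rows of a rotation matrix Rz(yaw) * Rx(roll), used as test data."""
    ca, sa = math.cos(yaw), math.sin(yaw)
    cb, sb = math.cos(roll), math.sin(roll)
    return [(ca, -sa * cb, sa * sb),
            (sa, ca * cb, -ca * sb),
            (0.0, sb, cb)]


def omega_from_row(u, du):
    """(II-55): U^2 * omega = dU x U for one row of the matrix."""
    norm_sq = sum(x * x for x in u)
    return tuple(x / norm_sq for x in cross(du, u))


def first_iteration_omega(rows, deltas):
    """Half the sum of the three per-row vectors."""
    vectors = [omega_from_row(u, du) for u, du in zip(rows, deltas)]
    return tuple(0.5 * sum(v[i] for v in vectors) for i in range(3))


if __name__ == "__main__":
    omega = (0.01, -0.02, 0.03)
    rows = orthonormal_rows(0.4, -0.7)
    # Row increments as in (II-45)-(II-47): dU_j = U_j x omega.
    deltas = [cross(u, omega) for u in rows]
    print(first_iteration_omega(rows, deltas))
```

Each per-row vector equals the true rate vector minus its projection on that row; summing over three orthonormal rows removes the vector once in total, which is why half the sum restores it.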

2.14 PROGRAMMING METHODS FOR PROCESSOR EVALUATION FOR REAL
DATA INPUTS - Real data which is recorded during actual flight is not expected
to be in a form directly assimilatable by the strap-down processor. In
addition to recording format, there are more fundamental differences stemming
from digital representation and sensor-transducer accuracy. The latter
problem is overcome by making yardstick calculations on the same transformed
real data that is input to the strap-down processor. The problem of
transforming data recorded by, say, a pulse stream analogue to digital converter to
the form in which a whole word sampler would present the real data is analyzed
in the next section.

2.15 INTERPOLATION CALCULATIONS ON WADD SUPPLIED DATA FOR
GENERATION OF STRAP-DOWN PROCESSOR INPUT DATA.

A. Proposed Methods - Data supplied by WADD for strap-down
processor evaluation, which is the recorded output of a pulse stream
analogue to digital converter, must be put in a form assimilatable
by the breadboard processor (it must be subjected to short lag
smoothings corresponding to analogue filtering in the required
analogue setup of the strap-down system assumed in strap-down
processor design).


decimal*.  Notice  that  for  Case  1,  9(5u»)  -  (Old  i^)  «  0  and  9  (8ui)  -  (Old  u,)  *  -0.  011718  709953. 
Corresponding  constants  for  R.  Fine's  E  Program  are: 

A  - -0.666666  291240  B0  *  (652  525  221  362), 

B  >  0.  001892  089378  B4  -  (000  017  377  777). 


The WADD supplied data must be processed by an interpolation
calculation which is capable of yielding each appropriate equivalent
(or interpolated) data point for the specific sampling time at which
the processor would normally sample real time data represented
by the WADD supplied data. Two possible interpolation
calculations are derived in the following sections for methods differing in
the degree of polynomial assumed and the number of fit data for
each computed data value. The first analysis assumes
computations based on step change in slope as a function of system
parameters which occur in the three point quadratic fit method. The second
analysis evaluates fractional error from that of equivalent inputs for
preprocessor section outputs. An error of fractional amplitude
(\omega\tau)^3/24† (90° out of phase with the input) is 0.6 x 10^{-6} for 1 cps inputs
to a processor with 266 it/sec. While the step slope changes at
tape data points for a three point quadratic interpolation routine are
an appreciable fraction (≈ 1 percent over 1 cps signals) of the total
slope, the effect on short integrals is small (0.6 x 10^{-6} for a 1 cps
signal), being at the border line of affecting the 20 bit input to the
processor.# If the errors were similar to the roundoff error of
independent distribution the system error would be small. If the
errors were similar to the roundoff error of a bias nature, the
system error would be large in one hour. A heuristic analysis based on
the assumption that the errors have a correlation time of 0.4 sec for
1 cps inputs leads to a net error of 0.6 x 10^{-6} x \sqrt{0.4 sec x 3600 sec}
= 2.16 x 10^{-5} radians ≈ 5 arc sec in inertial reference computation
after one hour. This is a small but appreciable error. The
conjecture that the use of interpolated values for both the
processor and for the evaluation of GP solutions removes most


$\omega\tau = 2\pi f\tau$, $f$ = frequency, $\tau$ = iteration interval of processor outputs.
# Assuming maximum angular rate frequency of 1.5 cps.



of the difference of solutions is complicated if a third order
algorithm is used in the latter, since discontinuities occur in the
higher differences utilized. These considerations, together with
the unacceptable complexity of a four point cubic interpolation, imply
the preferable use of the three point quadratic formula type, whose
calculation function is presented in the following section.
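
The magnitudes quoted in this discussion can be checked with a short calculation. The Python sketch below assumes that the fractional error is $(\omega\tau)^3/24$ with $\omega\tau = 2\pi f$ divided by the iteration rate, and that the one-hour figure arises from a random-walk style accumulation (error times the square root of correlation time times elapsed time). Both readings are assumptions drawn from the text above, and the function names are illustrative only.

```python
import math

ARCSEC_PER_RADIAN = 180.0 * 3600.0 / math.pi

def fractional_error(freq_cps, iterations_per_sec):
    """Assumed form (omega*tau)**3 / 24 of the out-of-phase fractional error."""
    omega_tau = 2.0 * math.pi * freq_cps / iterations_per_sec
    return omega_tau ** 3 / 24.0

def accumulated_error_rad(error_per_sample, correlation_time_s, elapsed_s):
    """Assumed random-walk growth: error * sqrt(correlation time * elapsed time)."""
    return error_per_sample * math.sqrt(correlation_time_s * elapsed_s)

per_sample = fractional_error(1.0, 266.0)                 # 1 cps input, 266 it/sec
one_hour = accumulated_error_rad(0.6e-6, 0.4, 3600.0)     # radians after one hour
print(per_sample, one_hour, one_hour * ARCSEC_PER_RADIAN)
```

Under these assumptions the per-sample figure is about $0.55 \times 10^{-6}$ and the one-hour figure about 4.7 arc sec, the same order as the values quoted in the text.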

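B.  Derivation of Three Point Quadratic Interpolation Method - A quadratic
interpolation formula of the form

\[ I^*_p = I_n + \lambda (p - n) + \mu (p - n)^2 \tag{II-56} \]

is correct at $p = n$. At $p = n+1,\ n-1$ correctness requires

\[ I_{n+1} = I_n + \lambda + \mu \tag{II-57} \]

\[ I_{n-1} = I_n - \lambda + \mu \tag{II-58} \]

for which

\[ \mu = \frac{\Delta^2 I_{n+1}}{2} \tag{II-59} \]

\[ \lambda = \frac{\Delta I_{n+1} + \Delta I_n}{2} \tag{II-60} \]

The quadratic interpolation formula is

\[ I^*_p = I_n + \frac{\Delta I_{n+1} + \Delta I_n}{2}\,(p - n) + \frac{\Delta^2 I_{n+1}}{2}\,(p - n)^2 \tag{II-61} \]

which for $p = t/T$ has the form

\[ I^*(t) = I_n + \frac{\Delta I_{n+1} + \Delta I_n}{2T}\,(t - nT) + \frac{\Delta^2 I_{n+1}}{2T^2}\,(t - nT)^2 \tag{II-62} \]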




Computation of $I^*(t)$ each interval $T/M$, starting at $t = nT$, is
investigated in terms of a difference operator $\Delta^*(\ )$ defined by

\[ \Delta^* X_p = X(pT/M) - X\big((p-1)T/M\big) \tag{II-63} \]

In terms of counts at $T/M$ intervals, where $t = rT/M$,

\[ I^*_{r/M} = I_n + \frac{\Delta I_{n+1} + \Delta I_n}{2M}\,(r - nM) + \frac{\Delta^2 I_{n+1}}{2M^2}\,(r - nM)^2 \tag{II-64} \]

The application of the $\Delta^*(\ )$ operator yields

\[ \Delta^* I^*_{r/M} = \frac{\Delta I_{n+1} + \Delta I_n}{2M} + \frac{\Delta^2 I_{n+1}}{M^2}\left[(r - nM) - 1/2\right] \tag{II-65} \]

For interpolation from $t = nT$ to $t = (n+1)T$ at intervals of $T/M$,
the three numbers $I_n$,

\[ \sigma_n = \frac{\Delta^2 I_{n+1}}{M^2} \tag{II-66} \]

and

\[ \delta_n = \frac{\Delta I_{n+1} + \Delta I_n}{2M} \tag{II-67} \]

are adequate for a sequential generation by pure summations in an
interpolator program used in input tape preparation for the strap-down
processor. Tape preparation by a computer whose multiplication
operation is slower than two addition times should be less
costly using the above calculation procedure than methods using
multiplication.
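
The summation procedure described above can be sketched as follows. This is a minimal Python illustration, not the report's interpolator program: the function name and argument names are assumptions, and the differences are taken as backward differences, $\Delta I_n = I_n - I_{n-1}$ and $\Delta^2 I_{n+1} = \Delta I_{n+1} - \Delta I_n$. After three constants are set up for the interval, each interpolated point costs two additions and no multiplication.

```python
def quadratic_by_summation(i_prev, i_n, i_next, m):
    """Interpolate from I_n to I_(n+1) in m equal steps using only additions
    inside the loop (three point quadratic fit through I_(n-1), I_n, I_(n+1))."""
    d_n = i_n - i_prev                  # backward difference at n
    d_next = i_next - i_n               # backward difference at n+1
    second = (d_next - d_n) / m ** 2    # second difference scaled by 1/M^2
    first = (d_next + d_n) / (2 * m)    # mean first difference scaled by 1/(2M)
    step = first + 0.5 * second         # first increment, i.e. r - nM = 1
    value = i_n
    points = [value]
    for _ in range(m):
        value += step                   # addition 1: advance the interpolated value
        step += second                  # addition 2: advance the increment
        points.append(value)
    return points

# Samples of p**2 at p = -1, 0, 1, interpolated at quarter steps from p = 0 to 1.
samples = quadratic_by_summation(1.0, 0.0, 1.0, 4)
print(samples)
```

For this example the routine returns 0, 0.0625, 0.25, 0.5625 and 1.0, which are the exact values of $p^2$ at quarter steps, and the last point reproduces the next tape value.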


C.  Derivation of Four Point Cubic Interpolation - A cubic interpolation
formula of the form

\[ I^*_p = I_n + \lambda (p - n) + \mu (p - n)^2 + \nu (p - n)^3 \tag{II-68} \]

is correct at $p = n$. At $p = n+2,\ n+1,\ n-1$ correctness requires

\[ I_{n+2} = I_n + 2\lambda + 4\mu + 8\nu \tag{II-69} \]

\[ I_{n+1} = I_n + \lambda + \mu + \nu \tag{II-70} \]

\[ I_{n-1} = I_n - \lambda + \mu - \nu \tag{II-71} \]

Adding the last two equations we obtain

\[ \mu = \frac{\Delta^2 I_{n+1}}{2} \tag{II-72} \]

Eliminating $\lambda$ and substituting $\mu$,

\[ \nu = \frac{\Delta^3 I_{n+2}}{6} \tag{II-73} \]

also

\[ \lambda = \Delta I_{n+1} - \mu - \nu \tag{II-74} \]

The four point cubic interpolation formula has for $p = t/\tau$ the form

\[ I^*(t) = I_n + \frac{\lambda}{\tau}\,(t - n\tau) + \frac{\mu}{\tau^2}\,(t - n\tau)^2 + \frac{\nu}{\tau^3}\,(t - n\tau)^3 \tag{II-75} \]

Computation of $I^*(t)$ each interval $\tau/M$, starting at $t = n\tau$, is
investigated in terms of a difference operator $\Delta^*(\ )$ defined by

\[ \Delta^*(X_p) = X(p\tau/M) - X\big[(p-1)\tau/M\big] \tag{II-76} \]



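Substituting $t = r\tau/M$ in the $I^*(t)$ formula,

\[ I^*_{r/M} = I_n + \frac{\lambda}{M}\,(r - nM) + \frac{\mu}{M^2}\,(r - nM)^2 + \frac{\nu}{M^3}\,(r - nM)^3 \tag{II-77} \]

since

\[ \Delta^* (r - nM)^2 = 2\left[(r - nM) - 1/2\right] \tag{II-78} \]

\[ \Delta^* (r - nM)^3 = 3\left[(r - nM)^2 - (r - nM) + 1/3\right] \]

Application of the operator $\Delta^*(\ )$ to $I^*_{r/M}$ yields

\[ \Delta^* I^*_{r/M} = \frac{\lambda}{M} + \frac{2\mu}{M^2}\left[(r - nM) - 1/2\right] + \frac{3\nu}{M^3}\left[(r - nM)^2 - (r - nM) + 1/3\right] \]

\[ \Delta^{*2} I^*_{r/M} = \frac{2\mu}{M^2} + \frac{6\nu}{M^3}\left[(r - nM) - 1\right] \]

\[ \Delta^{*3} I^*_{r/M} = \frac{6\nu}{M^3} \]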


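denoting

\[ \alpha_n = \frac{6\nu_n}{M^3}, \qquad \beta_n = \frac{2\mu_n}{M^2} - \frac{6\nu_n}{M^3} \]

where $\lambda$, $\mu$, $\nu$ (for interval $t = n\tau$ to $(n+1)\tau$) are computed
in the manner derived. Then

\[ \Delta^{*2} I^*_{r/M} = \beta_n + \sum_{r' = nM+1}^{r} \alpha_n \tag{II-80} \]

\[ \Delta^* I^*_{r/M} = \left(\frac{\lambda_n}{M} - \frac{\mu_n}{M^2} + \frac{\nu_n}{M^3}\right) + \sum_{r' = nM+1}^{r} \Delta^{*2} I^*_{r'/M} \tag{II-81} \]

\[ I^*_{r/M} = I_n + \sum_{r' = nM+1}^{r} \Delta^* I^*_{r'/M} \tag{II-82} \]

define a program computation requiring three add times per iteration.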
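
The three-addition recursion can be illustrated by the following Python sketch. It is an assumed arrangement, not the report's program: the function and variable names are hypothetical, backward differences are used throughout, and the starting values of the running first and second differences are those of the cubic evaluated at the tape point itself.

```python
def cubic_by_summation(i_prev, i_n, i_next, i_next2, m):
    """Interpolate from I_n to I_(n+1) in m equal steps with three additions
    per step (four point cubic through I_(n-1), I_n, I_(n+1), I_(n+2))."""
    d_n = i_n - i_prev
    d1 = i_next - i_n
    d2 = i_next2 - i_next
    mu = (d1 - d_n) / 2.0                    # half the second difference at n+1
    nu = ((d2 - d1) - (d1 - d_n)) / 6.0      # one sixth of the third difference at n+2
    lam = d1 - mu - nu
    alpha = 6.0 * nu / m ** 3                # constant third difference per step
    second = 2.0 * mu / m ** 2 - alpha       # running second difference at the tape point
    first = lam / m - mu / m ** 2 + nu / m ** 3   # running first difference at the tape point
    value = i_n
    points = [value]
    for _ in range(m):
        second += alpha                      # addition 1
        first += second                      # addition 2
        value += first                       # addition 3
        points.append(value)
    return points

# Samples of p**3 at p = -1, 0, 1, 2, interpolated at half steps from p = 0 to 1.
cubic_samples = cubic_by_summation(-1.0, 0.0, 1.0, 8.0, 2)
print(cubic_samples)
```

For this example the routine returns 0, 0.125 and 1.0, the exact values of $p^3$ at $p = 0,\ 1/2,\ 1$.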




CHAPTER III


ANALYTICAL DEVELOPMENTS DURING PHASE II IN THE
GENERAL THEORY OF REAL TIME COMPUTATION

3.0  MOTIVATION OF INVESTIGATIONS AND APPLICATIONS OF RESULTS -
The relations of input variables of the external world to the internal operation
of a computer, subject to certain general constraints in numerical operation,
provide a basis for development of a general theory of real time computation
which may otherwise have an abstract nature, in that mechanization is to be
deduced subsequently (independently taking into account hardware factors) in
applying the principles developed. The feasibility and use of such general
investigations stem from the fact that a computer is designed to accomplish
the analytical task. A major analytical development during Phase I was the
derivation of a theory of numerical integration applying to the most general
type of integration of an incremental computer (actually appropriate in nearly
all applications), which is the classical operation of Stieltjes integration.

The algorithms developed for precision integration of the Stieltjes type were
seen to be previously accomplishable within a framework of classical numerical
integration techniques only by such special methods as the Runge-Kutta,
which requires relatively complex, special, and inefficient computer
mechanisation. An integration algorithm is generally accomplished in digital
mechanisation by combinations of (1) extrapolation type operations, i.e. linear
weighings of past function values (in this case with relatively simply realised
weighings), and (2) transfer operations. In contrast to algorithms of the
Runge-Kutta type, the Stieltjes algorithms developed made possible the minimum
number of transfer operations (one per integral increment) and permitted
maximum rate of computation. Actually all algorithms have an undesirable degree
of (1) for transfer mechanisations of short bit length with respect to
independent variable in precise computation of Stieltjes integral increments.
The undesirability stems




from  the  requirement  to  quantize  independent  variables  at  given  short  bit 
lengths  to  an  effective  accuracy  of  much  higher  bit  lengths.  Hence  analytical 
investigations  were  launched  to  determine  the  feasibility  of  removing  this 
design  problem,  one  which  actually  occurs  in  internal  computation  alone 
since multi-transfer design is typically only single or several bit. The
result was the development of the theory of numerical Stieltjes integration in
terms of "virtual variables", i.e., variables closely related to desired
variables  but  which  are  involved  in  computations  simpler  than  the  desired 
variables.  External  inputs  to  a  real  time  computer  are  not  virtual  variables. 

It  is  possible  to  alter  integration  algorithms  used  in  input  processing  to  gener¬ 
ate  answers  in  virtual  variables.  The  major  result  of  the  investigation  was 
the  proof  that  there  exist  virtual  variables  which  in  general  may  be  utilized  in 
numerical  integration  with  respect  to  general  independent  variables  as  though 
they changed linearly, i.e., like time, although the actual variable does not. In
consequence  virtual  variable  numerical  Stieltjes  integration  may  have  pre¬ 
cision  corresponding  to  multi-transfer  bit  length  potential  using  direct  multi¬ 
transfer  of  that  bit  length.  The  generation  of  computer  outputs  involves 
simple  transformation  of  the  virtual  variable  using  available  data.  The  prac¬ 
tical  necessity  of  mechanising  the  transformation  depends  on  existence  of 
high  frequency  variables  and  relatively  high  precision  requirements.  The 
theoretical  relation  of  desired  and  virtual  variables  provides  a  further  demon¬ 
stration  of  the  fundamental  nature  of  the  computation  types  evolved  in  the 
contract  study,  the  realisation  of  which  has  enabled  the  design  of  the  first 
real  time  computer  (for  high  frequency  variable  applications)  capable  of  high 
precision  with  mechanisation  of  modest  complexity. 

Another  fundamental  problem  in  incremental  computation  finds  practical  moti¬ 
vation  for  solution  as  a  result  of  certain  computation  error  characteristics 
rather  than  mechanisation  characteristics.  It  is  generally  recognised,  and 




was quantitatively established during Phase II, that division in a conventional
DDA is relatively highly inaccurate. Aerospace applications for near orbital
speed missiles which undergo large altitude variations require precise division
capability for solution of the basic navigation equation, coordinate
transformation from satellite to earth coordinates, and many other important
functions. For a full aerospace mission, division algorithm mechanization is
highly desirable to obtain the required accuracy, though a breakthrough in
modified conventional DDA design (Chapter IX) presents a design alternative. To
develop a division algorithm mechanization which yields high precision there
must be a fundamental theory of numerical quotient computation. Such a
theory was developed by transforming the results of the theory of numerical
integration into numerical quotient computation form. The results of this
analysis are a basis of the design of the full scale computer developed in the
contract study, and have been simulated to confirm their validity.

3.  1  SIMPLIFIED  COMPUTATION  IN  TERMS  OF  VIRTUAL  VARIABLES  IN 
EXECUTING  STIELTJES  NUMERICAL  INTEGRATION  PROCESSES  - 

A.  INTRODUCTION - An investigation into the theory of Stieltjes
numerical integration for incremental computation, including
input processing and internal computation, has led to an important
combined computation structure. The relation of incremental
computer  processing  for  function  generation,  to  over-all  com¬ 
puter  processing  involving  real  time  input  variables,  has  been 
delineated  in  quantitative  form  with  important  implications  of 
significantly  simplified  mechanization  capable  of  precise  com¬ 
putation  for  both  types  of  computers.  Generally,  it  has  been 
shown  that  Stieltjes  numerical  integration  can  be  carried  out  in 
terms  of  "virtual"  variables,  by  which  is  meant  variables  not 




equal  to  the  desired  variables  but  bearing  a  fixed  transformation 
relationship  to  them.  Proper  algorithm  in  terms  of  virtual 
variables  can  be  made  simpler  for  an  internal  computer,  or 
function  "  generation"  computer,  in  that  the  classical  non- 
Stieltjes  numerical  integration  algorithms  apply  to  them.  How¬ 
ever,  in  principle  it  is  required  that  the  true  variables,  when 
they  are  to  be  extracted  from  the  computer  for  external  use,  be 
generated  by  inverse  transformation  of  the  virtual  variables. 

Since it was shown in continuing analysis that there exists a
class of virtual variables, which are slightly lagged relatively,
and any one of which satisfies the simplified algorithm
relationship, the specific virtual variables which most closely
approximate the true variables could be chosen. Analysis indicated that
the pertinent virtual variable differs from the true variable by a
second difference of magnitude approximately $(f/IR)^2$, where
$f$ = frequency of variable and $IR$ = iteration rate; for example, if
$f = 0.5$ cps and $IR = 100$ iter/sec, the virtual and true variables
differ by < 0.003 percent. Typically, the greatest demand for
accuracy  in  high  frequency  variable  computations,  is  within  the 
computer  (where  feedback  effects  can  cause  error  growth),  rather 
than  in  outputs.  The  outputs  have  the  highest  accuracy  demands, 
in  applications  such  as  navigation,  in  the  variables  with  primarily 
low  frequency  content.  Thus,  outputs  in  contrast  to  inputs  and 
intermediate  variables  of  a  precision  computer  may  be  taken  with 
usually  close  approximation  as  the  virtual  variables.  The  output 
mechanism  of  the  computer  system  is  analysed  elsewhere  in  the 
report,  in  relation  to  the  contingency  (for  general  computation  in 
aerospace  applications)  of  conversion  by  elementary  transforma¬ 
tion  from  virtual  variables  to  true  variables  in  outputs.  The  internal 


computer design in this study is the proposed QODA whose registers
contain the information available and necessary for the
virtual variable to true variable conversion in mechanisation
through alternative use of logic necessarily present for integrator
operation in effecting the precision algorithm. The carrying out
of this design modification is determined by the particular
computer applications, i.e. their precision requirement and variable
frequencies.
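
The size of the discrepancy between virtual and true variables quoted earlier in this introduction can be checked with a one-line calculation. The sketch below simply evaluates the ratio of variable frequency to iteration rate, squared, expressed as a percentage; the function name is an illustrative assumption.

```python
def virtual_true_difference_percent(freq_cps, iterations_per_sec):
    """Approximate virtual/true discrepancy, (f / IR)**2, as a percentage."""
    return 100.0 * (freq_cps / iterations_per_sec) ** 2

example = virtual_true_difference_percent(0.5, 100.0)
print(example)
```

For 0.5 cps at 100 iterations per second this gives 0.0025 percent, consistent with the bound of 0.003 percent stated above.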


The  theory  of  the  combined  input  processor,  internal  computer 
algorithm  structure  involving  virtual  variable  computation,  leads 
to  less  complex  internal  computer  design.  In  principle,  an  input 
processor  designed  as  part  of  this  structure  should  generate 
virtual  rather  than  true  variables.  However,  the  following  factors 
led  to  a  decision  to  fabricate  the  previously  proposed  (strap-down) 
input  processor  design: 


1.  The previously proposed input processor design generates
true variables most readily used to evaluate design
performance.

2.  A later combined input processor internal computer system
has an input processor design obtainable by modest logical
design modifications relative to the previously proposed unit.


B.  ANALYSIS - Consider the generating function for parametric
calculation form associated with unlagged integrand and independent
variables, in numerical Stieltjes integration (Chapter 5, Section
5 of the Phase I final report):


F  (a.  b) 


iri"ii-vi. 
‘"<KbiJ  L  •‘b  J  s 


(3-D 


O 




Denote  the  operators. 


[>»U-  v]  ** 

(3-2) 

r  i”<i -vi  *r  -*b  i 

[  -*>  J 

V* 

(3-3) 

the  operator  O  with  respect  to  any  variable  has  inverse  O*1  since 

0.  O'1  ■  1  as  seen  in  the  ease  O  ~l. 

D 

Then 

r  <*•  b>  «  ®ab  S’1  «b 

(3-4) 

A similar relationship holds for lagged or led integrand in the
parametric calculation cases, since the associated generating
function for p iterations lag (p may be positive or negative) is

FMfcwTr  *  I*  £lSi_£L£i 

l  *  “J  (,b)p 

(3-3) 

then 

re  (a,  b)  ■  r  fa  *[  b P  L 
<.b)p  ^ 

(3-6) 

substituting  the  F  (a,  b)  function 

„  .  ,  r  *\b  T 

°r  ** 

-t 

1  A 

[(.wp  hi -*^ij 

[bp  In  (1  - 

r 

■  •\b#*V0  *b 

(3-7) 



where: 


8* 

x 


-6 

x 


(xpln  (1  -  6^) 


(3-8) 


x  being  either  ab  or  b.  If  computation  of  lagged  integral  is  con¬ 
sidered,  then  note  from 


AIn  -  F  (a,  b)  yn  xn 

ai  (.wi . r  w  f r  (» S)  u  b")  l 

“  (»b)P’q  L  n  *  J 

any  p,  q,  from  which 


AI 

-q 


r  (».  b)  bp"q 

(eb)p“q 


y  * 

'n-p  n-q 


(3-9) 

(3-10) 


(3-11) 


The associated generating function for integral and independent
variable lagged q iterations and integrand lagged p iterations is


...  M  . 

'*•  bl  («b)p-‘> 

(3-12) 

the superscripts on the left indicating q lagged integral and
independent variable, p lagged integrand

r-T  J  '***  lol 

V 

• 

o’0* 

“*  1 

N  - 

b»  l  <.«>-»  1.  (1 .  WJ 1 

.b^lnd-^J  b 

•  *s • *r‘  •  s 

(3-13) 



where 


6** 

x 


(q.  p)  ,  I"  "6*  1 

Lxp"q  In  {1  -  6)  J 


(3-14) 


In terms of the formula for integration in these variables


AI  « 
n-q  ^ 


»*£  y  o  (W>* 
ab  7n-p  b 


-l 


6.  x  )  1 
b  n-q  I 


(3-15) 


Consider now a computation in terms of virtual variables of x, y
and integral designed to be simpler than that of direct parametric
algorithm computation. Take the virtual $x_n$ variable to be $x^*_n$
defined by


x*  «  e**"x  x 


n-q 


then 


(3-16) 


x  ■  •**  x* 

n-q  b  n 


Taking  the  virtual  variable  of  integral  to  be 


AI*  ■  •*£ 
n  ab 


-l 


AI 


n-q 


then 


AI  >  0*P  I  y 
n-q  ab  [7n-p 

multiplied  by  OJg  yielda 

AI*  ■  y  o  x*  6. 
n  n-p  n  b 


o  4.  x* 
b  n 


] 


(3-17) 


(3-18) 


(3-19) 


(3-20) 


the virtual variables $I^*_n$, $x^*_n$ being formed by identical operators
and delay on the desired variable.


Consider computation in which the y variable is converted to a virtual
variable of the identical kind to that of I, x, for which

y  «  0**  y*  (3-22) 

7n-q  a  7n 

note 


y  ■  y  a**”**  g  y*  (3-23) 

7n-p  'n>q  a  'n 

then  substituting  y  in  the  partially  virtual  variable  equation 
n-p 

(3_24) 

where 


■  »m  (»•*») 

thus. 

AI*  ■  •  yj  o  x*  6.  (5-26) 

is a virtual variable algorithm involving a single operator process
on integrand but not independent variable and a single whole number
multiplication per iteration. The corresponding algorithm
described in previous analysis as a nonparametric algorithm not




involving  virtual  variables  involves  a  single  operator  but  two  or 
more  whole  number  multiplications. 

This  virtual  variable  algorithm  can  be  directly  applied  in  any 
incremental  function  generation  (without  external  inputs)  by 
simply  starting  variables  at  virtual  values.  The  generated 
virtual  variables  are  at  subsequent  times  related  to  the  desired 
variables  by  a  linear  operator  generally  differing  slightly  for  the 
actual  variable.  In  closed  loop  computations  the  crude  use  of 
approximate  algorithm  could,  of  course,  lead  to  a  build  up  of 
large  errors.  But  this  is  avoided  in  the  virtual  variable  case, 
because  only  the  externally  observed  variable  is  in  error. 

The  problem  of  processor  application  of  the  virtual  variable 
approach  is  associated  with  the  problem  in  generating  x*  where 


(3-27) 




assuming  that  an  input  delay  of  rT  is  reliable  then 


bP“rx 


*[  1  +T  *  "T  +  •  •  •]  11  •  V**'  Vr 

+  6  *1  x 

2  /  b  J  n-r 

taking  (p-r)  ■  1/2  then 

X.  12*  .  1*3 

hence  for  (p-r)  ■  1/2, 

{*-■£  4- 


x*  -| 


_  1  As  _ 

a  n-r  24  »-r 


(3-28) 


(3-2?) 


(3-30) 

(3-31) 


is the relation of the actual desired variable to the virtual variable
computed. Here $x_n$ can be taken as $\Delta I_n$, $y_n$ as well as the
variable $x_n$ of (3-1). Then the algorithm computation in terms of




virtual  variables  is  Eq  (3-2)  for  undelayed  variables,  which  in 
direct  computation  form  is 

K  -[»:■  <3-32> 

For  undelayed  and  also  delayed  variables  the  classical  algorithm 
applying  to  the  case  of  uniformly  increasing  independent  variables 
holds  for  general  independent  variables  in  virtual  variables. 

3.  2  ANALYSIS  OF  HIGHER  ORDER  PARALLEL  INTEGRATION  ALGO¬ 
RITHM  FOR  COMPUTATIONS  INVOLVING  DIVISION  -  Parallel  integration 
algorithm  of  a  QDPU  (Quotient  Differential  Processing  Unit)  was  investigated 
for  modes  involving  division.  The  sought  for  QDPU  design  has  the  purpose 
of  equivalently  computing  in  the  manner  (apart  from  round  off  properties)  of 
an  integrator  ensemble  in  each  of  two  parallel  channels.  The  quotient  gen¬ 
eration  action  of  the  QDPU  differs  in  fundamental  form  from  that  of  a  pure 
integration  process.  A  theory  of  higher  order  integration  algorithm  for  the 
QDPU  for  second  order  integration  accuracy  is  of  interest  in  an  incremental 
computer  design.  The  desired  second  order  algorithm  is  two  levels  of 
accuracy  greater  than  those  designed  in  existing  DDA  hardware.  Some  in¬ 
vestigators  have  formed  and  used  the  DDA  algorithm  problem  as  the  re¬ 
sultant  of  a  goal  to  accurately  execute  incrementation  of  algebraic  relations 
and  to  solve  difference  equations  inferentially  obtained  from  the  application 
computation.  Actually  the  great  majority  of  application  computations  involve 
a  close  relationship  between  both  algebraic  and  integral  incrementation. 
Algebraic  relation  incrementation  is  readily  expressible  in  terms  of  equiva¬ 
lent  integral  incrementation,  however,  the  reverse  process  is  possible  only 
through  the  use  of  appropriate  integration  algorithm  in  one  manner  or  another. 
The  concept  of  effecting  accurate  computation  without  mechanization  of  pre¬ 
cision  integration  relies  on  finding  a  set  of  difference  equations  (ordinarily  a 




modification  of  the  application  differential  equations)  which  are  equivalent  to 
the  desired  calculation  for  incremental  computations.  In  the  applications  of 
real  importance  the  desired  calculation  is  almost  without  exception  of  such 
non-linear  nature  and  coupling  complexity  that  it  would  be  a  matter  of  luck  to 
find  the  equivalent  set  of  difference  equations  sought,  and  usually  a  price  in 
computation  accuracy  would  result.  One  of  the  two  outputs  of  the  QDPU  for 
a  mode  involving  an  element  of  division  has  the  purpose  of  executing  the 
computation 


\[ \Delta\theta_n = \int_{(n-1)\tau}^{n\tau} \frac{P(t)}{V(t)}\, dx(t) \tag{3-33} \]

The theory of Stieltjes numerical integration for virtual variables relates
$\Delta\theta_n$ to $P_n$, $V_n$, $x_n$ and various differences thereof for a
given order of accuracy. The QDPU mechanization effects transfers and decision
processes in a manner different from the direct algorithm form. The analytical
problem of deriving an explicit QDPU algorithm is that of finding the equivalent
mechanizable algorithm. Analysis has led to an equivalent algorithm of
second order accuracy, the terms of which are mechanizable with second
difference communication, with the possible exception of a single small term
involving a second difference term of the independent variable. The
corresponding case of ordinary integration in virtual variables did not require
a second difference term in the algorithm of the independent variable, which
was the purpose of the introduction of virtual variables. The implications of
this result will be examined further in relation to theory, mechanization, and
accuracy for the QDPU.




The theory of Stieltjes numerical integration developed in Phase I and
simplified algorithms for machine computation dealt with ordinary incremental
integration. Thus the operation

\[ \Delta\theta_n = \int_{(n-1)\tau}^{n\tau} y\, dx \tag{3-34} \]

is effectively performed on the given series of $y_n$ values given in the series
of $\Delta x_n$, by carrying out the computation in virtual variables $y^*$ and
$x^*$ using an algorithm of form

\[ \Delta\theta^*_n = \left[ y^*_n + \lambda_1 \Delta y^*_n + \lambda_2 \Delta^2 y^*_n + \dots \right] \Delta x^*_n \tag{3-35} \]


The problem of incremental computation where $y_n$ is not given but rather,
where

\[ y_n = P_n / V_n \tag{3-36} \]

involves the given information $\Delta P_n$, $\Delta V_n$ series and $P_0$, $V_0$
for QDPU operation. Substituting Eq. (3-36) in Eq. (3-35), the computation
should perform in one way or another, in virtual variables (the asterisk of
which is hereafter dropped for brevity),

\[ \Delta\theta_n = \left[ \frac{P_n}{V_n} + \lambda_1 \Delta\!\left(\frac{P_n}{V_n}\right) + \lambda_2 \Delta^2\!\left(\frac{P_n}{V_n}\right) + \dots \right] \Delta x_n \tag{3-37} \]


* Chapter 5, Part 5, First Phase Technical Documentary Report on Development
of an Airborne HSDDA, H. W. Banbrook, 7 July 1961. Contract
AF 33(616)-6936




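The following analysis is carried out to achieve second order numerical
integration accuracy. The first order difference of a quotient is determined
to second order accuracy as follows:

\[ \Delta\!\left(\frac{P_n}{V_n}\right) = \frac{P_n}{V_n} - \frac{P_{n-1}}{V_{n-1}} = \frac{P_n}{V_n}\left[ 1 - \frac{1 - \Delta P_n / P_n}{1 - \Delta V_n / V_n} \right] \]

\[ \approx \frac{\Delta P_n}{V_n} - \frac{P_n}{V_n^2}\,\Delta V_n + \frac{\Delta P_n\, \Delta V_n}{V_n^2} - \frac{P_n}{V_n^3}\,(\Delta V_n)^2 \tag{3-38} \]

For the same accuracy level when used in Eq. (3-37),

\[ \Delta^2\!\left(\frac{P_n}{V_n}\right) = \Delta\!\left( \frac{\Delta P_n}{V_n} - \frac{P_n}{V_n^2}\,\Delta V_n \right) \tag{3-39} \]

using the exact difference formula for a product. In Eq. (3-38), if
$P \to \Delta P$ we obtain $\Delta(\Delta P_n / V_n)$ and if $P \to \Delta V$ we
obtain $\Delta(\Delta V_n / V_n)$. Truncating all terms of higher order than
second,

\[ \Delta^2\!\left(\frac{P_n}{V_n}\right) = \frac{\Delta^2 P_n}{V_n} - \frac{2\,\Delta V_n\, \Delta P_n}{V_n^2} + \frac{2 P_n (\Delta V_n)^2}{V_n^3} - \frac{P_n\, \Delta^2 V_n}{V_n^2} \tag{3-40} \]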
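
The second order expansion of the first difference of a quotient can be checked numerically. The following Python sketch compares the exact backward difference of P/V with the four-term expansion obtained by the geometric series; the function names and the sample values are illustrative assumptions only.

```python
def quotient_difference_exact(p_n, v_n, dp, dv):
    """Exact backward difference P_n/V_n - P_(n-1)/V_(n-1)."""
    return p_n / v_n - (p_n - dp) / (v_n - dv)

def quotient_difference_second_order(p_n, v_n, dp, dv):
    """Expansion of the same difference, truncated after second order terms."""
    return (dp / v_n
            - p_n * dv / v_n ** 2
            + dp * dv / v_n ** 2
            - p_n * dv ** 2 / v_n ** 3)

exact = quotient_difference_exact(3.0, 2.0, 0.01, 0.02)
approx = quotient_difference_second_order(3.0, 2.0, 0.01, 0.02)
print(exact, approx, exact - approx)
```

With increments of order $10^{-2}$ the two values agree to about $10^{-6}$, which is the size expected of the neglected third order terms.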




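Substituting these results in Eq. (3-37),

\[ \Delta\theta_n = \frac{\Delta x_n}{V_n} \Big\{ P_n \Big[ 1 - \lambda_1 \frac{\Delta V_n}{V_n} + (2\lambda_2 - \lambda_1) \frac{(\Delta V_n)^2}{V_n^2} - \lambda_2 \frac{\Delta^2 V_n}{V_n} \Big] + \lambda_1 \Delta P_n + (\lambda_1 - 2\lambda_2) \frac{\Delta V_n\, \Delta P_n}{V_n} + \lambda_2 \Delta^2 P_n \Big\} \tag{3-41} \]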

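The QDPU must effect Eq. (3-37) or Eq. (3-41) without using division directly.
To do this, assume a realizable form of QDPU computation and investigate the
level of equivalence to Eq. (3-41) possible by parameter adjustment. The form

\[ \Delta\theta_{n\,\mathrm{QDPU}} = \left[ \frac{P_n + \lambda_1' \Delta P_n + \lambda_2' \Delta^2 P_n + \epsilon_1}{V_n + \mu_1' \Delta V_n + \mu_2' \Delta^2 V_n + \epsilon_2} \right] \Delta x_n \tag{3-42} \]

is realizable in the stated sense (assuming that $\epsilon_1$, $\epsilon_2$, which
are of second order, are individually realisable), as is shown in the later
presentation of the theory of QDPU operation. To put Eq. (3-42) into the form of
a polynomial in $\Delta P_n$, $\Delta V_n$ similar to Eq. (3-41), use the relation
obtained by the geometric series:

\[ \frac{1}{V_n + \mu_1' \Delta V_n + \mu_2' \Delta^2 V_n + \epsilon_2} \approx \frac{1}{V_n} \left[ 1 - \frac{\mu_1' \Delta V_n + \mu_2' \Delta^2 V_n + \epsilon_2}{V_n} + \left( \frac{\mu_1' \Delta V_n}{V_n} \right)^2 \right] \tag{3-43} \]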




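Substituting Eq. (3-43) in Eq. (3-42) and expanding the product,

\[ \Delta\theta_{n\,\mathrm{QDPU}} = \frac{\Delta x_n}{V_n} \Big[ P_n + \lambda_1' \Delta P_n + \lambda_2' \Delta^2 P_n + \epsilon_1 - \mu_1' \frac{P_n \Delta V_n}{V_n} - \mu_2' \frac{P_n \Delta^2 V_n}{V_n} - \epsilon_2 \frac{P_n}{V_n} - \lambda_1' \mu_1' \frac{\Delta P_n\, \Delta V_n}{V_n} + \mu_1'^2 \frac{P_n (\Delta V_n)^2}{V_n^2} \Big] \tag{3-44} \]

For equivalence of Eq. (3-42) to Eq. (3-41) to first order, the first order
terms of Eq. (3-44) must equal those of Eq. (3-41), hence take

\[ \lambda_1' = \mu_1' = \lambda_1 \]

The difference of the second order terms of the QDPU calculation of
Eq. (3-42) from those of the desired calculation Eq. (3-41) is

\[ \Big\{ (\lambda_2' - \lambda_2) \frac{\Delta^2 P_n}{V_n} + (\lambda_2 - \mu_2') \frac{P_n \Delta^2 V_n}{V_n^2} + (\lambda_1^2 + \lambda_1 - 2\lambda_2) \Big[ \frac{P_n (\Delta V_n)^2}{V_n^3} - \frac{\Delta P_n\, \Delta V_n}{V_n^2} \Big] \Big\} \Delta x_n \tag{3-45} \]

The first two terms may be nulled taking

\[ \lambda_2' = \lambda_2 \]

\[ \mu_2' = \lambda_2 \]

The last term is predetermined by the first order equivalence conditions.

The error of the QDPU algorithm with $\lambda_1' = \mu_1' = \lambda_1$,
$\lambda_2' = \mu_2' = \lambda_2$ is:

\[ \Delta\epsilon_{n\,\mathrm{QDPU}} = \Big\{ (\lambda_1^2 + \lambda_1 - 2\lambda_2) \Big[ \frac{P_n (\Delta V_n)^2}{V_n^3} - \frac{\Delta P_n\, \Delta V_n}{V_n^2} \Big] + \frac{\epsilon_1}{V_n} - \frac{P_n}{V_n^2}\,\epsilon_2 \Big\} \Delta x_n \tag{3-46} \]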


Such realisable choices of $\epsilon_1$, $\epsilon_2$ are sought so that the error
is minimised. First consider the explicit algorithms of Eq. (3-35) sought for
realisation, noting that the QDPU algorithm has analogous form

\[ \Delta\theta_{n\,\mathrm{QDPU}} = \left[ \frac{P_n + \lambda_1 \Delta P_n + \lambda_2 \Delta^2 P_n + \epsilon_1}{V_n + \lambda_1 \Delta V_n + \lambda_2 \Delta^2 V_n + \epsilon_2} \right] \Delta x_n \tag{3-47} \]

except for $\epsilon_1$, $\epsilon_2$, to be determined.

For unlagged input variables $\lambda_1 = -1/2$, $\lambda_2 = -1/12$, hence the
constant coefficient of the error term Eq. (3-46) is

\[ (\lambda_1^2 + \lambda_1 - 2\lambda_2) = -1/12 \]

For lagged input variables $\lambda_1 = +1/2$, $\lambda_2 = +5/12$ and

\[ (\lambda_1^2 + \lambda_1 - 2\lambda_2) = -1/12 \]

the same value. When only one variable is lagged it is readily shown that
Eq. (3-47) is replaced with an expression obtained by replacing the lagged
variable terms with that obtained by the substitution

\[ X_n \to X_n + \Delta X_n + \Delta^2 X_n \tag{3-48} \]




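the choice of $\epsilon_1$, $\epsilon_2$ being unaltered because of their second
order magnitude. Thus the choice of $\epsilon_1$, $\epsilon_2$ is generally that
which minimizes $|\Delta\epsilon_{n\,\mathrm{QDPU}}|$, where

\[ \Delta\epsilon_{n\,\mathrm{QDPU}} = \Big[ \frac{1}{12} \frac{\Delta V_n}{V_n^2} \Big( \Delta P_n - \frac{P_n}{V_n}\,\Delta V_n \Big) + \frac{\epsilon_1}{V_n} - \frac{P_n}{V_n^2}\,\epsilon_2 \Big] \Delta x_n \tag{3-49} \]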
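
The statement that the constant coefficient of the error term takes the same value for lagged and unlagged inputs can be verified with exact rational arithmetic. The sketch below assumes the coefficient has the form $\lambda_1^2 + \lambda_1 - 2\lambda_2$ and that the two coefficient pairs are $(-1/2, -1/12)$ and $(+1/2, +5/12)$; the function name is illustrative.

```python
from fractions import Fraction

def error_coefficient(lambda_1, lambda_2):
    """Constant coefficient of the residual second order error term."""
    return lambda_1 * lambda_1 + lambda_1 - 2 * lambda_2

unlagged = error_coefficient(Fraction(-1, 2), Fraction(-1, 12))
lagged = error_coefficient(Fraction(1, 2), Fraction(5, 12))
print(unlagged, lagged)
```

Both evaluations give -1/12, so the residual term is the same whether or not the inputs are lagged.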


Consider  choices  »s  and  their  implications.  If  we  choose 
-AV 


n 


12  4<W 


«8  «  0 


(3-50) 


then  «&eQDpu  »  0  formally.  The  «,  term  enters  as  a  small  second  order 
term  when  the  quotient  P^/V^  changes  in  a  regular  manner  e.  g.  where 
V  *  0.  No  satisfactory  way  of  effecting  the  term  generally  in  an  incre- 
mental  algoritbim  has  been  devised.  The  inclusion  of  the  term  in  special 
cases  will  be  discussed  later.  Approximate  second  order  algorithim  is 
given  with  e*.  e,  *  0  in  Eq.  (3-49).  Write  this  relation  (which  can  be 
derived  readily  by  approximate  analyses)  in  the  form 

0  «  P  AX  -  V  A° 
n  n  n  n 


where 


( >. 


(  >.  +  M(  >„  ♦».*•(  »• 


For  the  moment  assume  that  AO  is  estimated  in  some  manner  and  that  an 

& 


R  register  is  updated  according  to 


R 


R  ,  +  P  AX  -  V  AO 
n- 1  n  n  n  n 


(3-51) 


HI-19 


Then if $\Delta\theta_n$ had been estimated exactly (according to the approximate second order level), no net change would occur in R in that iteration. A residue of past errors is reflected in $R_n$ through the value $R_{n-1}$. Consider the method of making estimates of $\Delta\theta_n$ in the case where $\Delta\theta_n$ is represented as a single increment. The $\Delta\theta$ value actually available is the lagged value estimated at the previous iteration. Let the decision that output $\Delta\theta_n$ be represented as a given magnitude be made using the magnitude of $R_n$, where

$$R_n = R'_{n-1} + P_n\,\Delta X_n \tag{3-52}$$

where $R'_{n-1}$ is a residue corrected for all past decisions by appropriate addition of $-V\,\Delta\theta$ terms; accordingly we have the relation

$$R'_{n-1} = R_{n-1} - V_{n-1}\,\Delta\theta_{n-1} = R_{n-1} - V_{n-1}\,\Delta\theta^L_n \tag{3-53}$$

where $\Delta\theta^L_n$ is the lagged incoming value of the nth iteration, equal to the computed value of the (n-1)st iteration. Then the computation of $R_n$ is given by

$$R_n = R_{n-1} + P_n\,\Delta X_n - V_{n-1}\,\Delta\theta^L_n \tag{3-54}$$


The decision for output $\Delta\theta_n$ (no overflow) is made according to a correction which minimizes the absolute value of $R_n - V_n\,\Delta\theta_n$.
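
The single-increment decision rule of Eqs. (3-52) to (3-54) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the function name, the ternary output set {-q, 0, +q} and the floating point arithmetic are not taken from the report, which defers the digital representation to later chapters.

```python
def quotient_single_increment(P, V, dX, q=1.0):
    """Hypothetical single-increment quotient loop after Eqs. (3-52)-(3-54).

    P, V, dX are equal-length sequences; q is the assumed output quantum.
    Returns the list of output increments (each -q, 0 or +q).
    """
    R = 0.0         # R register: residue before the current decision
    lagged = 0.0    # lagged increment, i.e. the decision of iteration n-1
    V_prev = V[0]   # V_{n-1}; its value is irrelevant at n = 0 since lagged = 0
    out = []
    for P_n, V_n, dX_n in zip(P, V, dX):
        # Eq. (3-54): add the numerator term and charge the lagged decision.
        R = R + P_n * dX_n - V_prev * lagged
        # Decision: the increment that minimizes |R_n - V_n * dtheta_n|.
        d = min((-q, 0.0, q), key=lambda c: abs(R - V_n * c))
        out.append(d)
        lagged, V_prev = d, V_n
    return out


increments = quotient_single_increment([3.0] * 100, [4.0] * 100, [1.0] * 100)
print(sum(increments))
```

With constant P = 3, V = 4 and unit increments of X, the emitted increments settle into a pattern averaging three-fourths per iteration, so their running sum stays within about one quantum of the quotient integral.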

In multi-increment computation use may be made of the relation

$$\Delta\theta_n = \Delta\theta_{n-1} + \Delta^2\theta_n \tag{3-55}$$

where $\Delta\theta_{n-1}$ is known and $\Delta^2\theta_n$ is to be determined. Then the ideal R-register value is

$$R_n = R_{n-1} + P_n\,\Delta X_n - V_n\,\Delta\theta_n = R_{n-1} + P_n\,\Delta X_n - V_n\,\Delta\theta_{n-1} - V_n\,\Delta^2\theta_n \tag{3-56}$$


In making the decision as to the magnitude of $\Delta^2\theta_n$ we may use the magnitude $R_n$ given by

$$R_n = R'_{n-1} + P_n\,\Delta X_n - V_n\,\Delta\theta_{n-1} \tag{3-57}$$

where

$$R'_{n-1} = R_{n-1} - V_{n-1}\,\Delta^2\theta_{n-1}$$

being the ideal of the past iteration which at the next iteration is realizable. Thus the multi-increment quotient algorithm determines output $\Delta^2\theta_n$ according to the magnitude

$$R_n = R_{n-1} + P_n\,\Delta X_n - V_n\,\Delta\theta^L_n - V_{n-1}\,\Delta^2\theta^L_n \tag{3-58}$$


where

$$\Delta\theta^L_n = \Delta\theta_{n-1}, \qquad \Delta^2\theta^L_n = \Delta^2\theta_{n-1},$$

the L indicating lagged outputs which are inputs to the computation.

The decision for output $\Delta^2\theta_n$ is made according to the choice which minimizes the absolute value of

$$R_n - V_n\,\Delta^2\theta_n$$
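
The multi-increment rule of Eqs. (3-55) and (3-58) can be sketched the same way. As before this is a hedged illustration, not the report's mechanization: the function name, the ternary second-difference set {-q, 0, +q} and the use of floating point are assumptions made only for the example.

```python
def quotient_multi_increment(P, V, dX, q=1.0):
    """Hypothetical multi-increment quotient loop after Eq. (3-58).

    Returns (theta, first_differences): the accumulated output and the
    first difference emitted at every iteration.
    """
    R = 0.0
    d1_lag = 0.0    # lagged first difference
    d2_lag = 0.0    # lagged second difference
    V_prev = V[0]   # V_{n-1}; irrelevant at n = 0 since d2_lag = 0
    theta = 0.0
    first_differences = []
    for P_n, V_n, dX_n in zip(P, V, dX):
        # Eq. (3-58): charge both lagged outputs against the numerator term.
        R = R + P_n * dX_n - V_n * d1_lag - V_prev * d2_lag
        # Decision: second difference minimizing |R_n - V_n * d2|.
        d2 = min((-q, 0.0, q), key=lambda c: abs(R - V_n * c))
        d1 = d1_lag + d2            # Eq. (3-55)
        theta += d1
        first_differences.append(d1)
        d1_lag, d2_lag, V_prev = d1, d2, V_n
    return theta, first_differences


theta, d1 = quotient_multi_increment([2.75] * 48, [1.0] * 48, [1.0] * 48)
print(theta, d1[:8])
```

In this toy run the first difference can change by only one quantum per iteration, so it takes several iterations to climb toward the required rate of 2.75 and then settles; after 48 iterations the accumulated output agrees with the quotient integral (132) to within half a quantum.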


The computations for different digital representations are presented in Chapters VII and VIII.

Consider again the problem of the (small) second order term $\varepsilon_1$ omitted in the algorithm just derived. If formally included, the more exact calculation of $R_n$ is

$$R_n = R_{n-1} + P_n\left[1 + \varepsilon_1\right]\Delta X_n - V_n\,\Delta\theta^L_n - V_{n-1}\,\Delta^2\theta^L_n$$




where $\varepsilon_1\,\Delta X_n$ is given by an expression that is illegible in the source copy; the legible fragments show terms in $\Delta V_n$, $\Delta^2 V_n$, $\Delta^2 X_n$, $P_n$ and $V_n$ carrying the factor 1/12.


If the primary contribution is from the first of the two terms of $\varepsilon_1\,\Delta X_n$, then the improved calculation is


$$R_n = R_{n-1} + P_n\,\Delta X_n - V_n\,\Delta\theta_n - [\text{correction term in } \Delta V_n \text{, illegible in the source copy}]$$


for the approximate second order algorithm.




CHAPTER  IV 


DESIGN  CONCEPTS  FOR  COMPUTER  SYSTEMS  WITH 
PROGRAMMABLE  INPUT  PROCESSING  CAPABILITY 

4.0 INTRODUCTION - Input processing is required in special computation routines demanding a level of computation capability higher than the internal computer processes, which have already been assigned the time consuming computation routines associated with the computation task of the mission.

The special computation routines requiring input processing are characterized by one or both of the following:

1.  High  frequency  inputs  presenting  rate  handling  and  precision 
problems  to  an  incremental  computer. 

2.  Bulky  data  processing  at  high  rate,  the  structure  of  which  is 
simple  and  repetitive  in  form. 

A hybrid GP-DDA computer system without the input processor mode is capable of performing moderately demanding computation tasks which could not be performed by a less sophisticated system. Many special routines of an application computation are sufficiently demanding as to require full input processing in addition to the normal processing capability of the direct GP-DDA combination. They impose their input processing requirement as a result of a need for special digital processing features which (as a major result of this phase of the contract study) are shown to be largely attainable (without significant increase in complexity) by implementing share modal action* of the GP-QDDA system or QDDA computer to achieve input processing functions.

*The processing action resulting from essentially the same hardware complex used through instantaneous modal switching to effect modified multiplier bit length or modified subroutine iteration rate.

There are four design problems involved in realizing programmable input processing by share modal action. These problems require:

1.  Multiplier  share  and  modal  switching  action  which  implement 
input  processing  without  significant  increase  in  system 
complexity. 

2. Programmability of input processing routine to perform any
of a number of input processing applications such as: strap-down
computations, midcourse guidance, air data computations
during re-entry for ICBM terminal guidance, damping
and digital servo computations in navigation, digital autopilot,
fire control, and radar terrain picturing.

3. Programmable input pre-processing and extrapolation
processing which characterize input processors of maximized
iteration rate, and that achieve precision algorithm. (They
present a problem in only one of the two design types
developed.)

4.  Communication  and  storage  modifications  for  precision  com¬ 
putation  and  processing  versatility. 

These  problems  will  be  analyzed  separately  in  the  following  discussions  after 
which  results  will  be  combined  to  form  system  configurations  augmented  to 
attain input processing for the GP-QDDA and QDDA systems.

4.1 ARITHMETIC CAPABILITY FOR INPUT PROCESSING IMPLEMENTED
WITHOUT SIGNIFICANT INCREASE IN SYSTEM COMPLEXITY

A. GP-QDDA Computer System With Input Processor - The majority of input processing computations involve high frequency variables which at moderate iteration rates must be executed with whole word multiplication or many-bit transfer. In the case of the GP-QDDA system, the whole word multiplier of the GP can basically supply this required whole word multiplication capacity at adequate iteration rate for input processing provided balanced time sharing is achieved. The most demanding application, i.e., strap-down navigation, requires an estimated one-fourth time share of the fast multiplier services to realize adequate accuracy; other applications involving precision input processing require considerably less.

The effectively lower iteration rate of the GP program is more than offset by the reduced program task of the GP as a result of increased program allocation to the QDDA and input processor mode. The sharing pattern adopted for the whole word multiplier should have a time schedule which requires minimum buffering of inputs for input processing action, and minimum loss of GP program efficiency resulting from share action. For an accurate incremental integration algorithm to be obtained with minimum complexity it is necessary that inputs be sampled at equal time intervals. The implementation of constant input pre-processing intervals and minimum buffering implies the adoption of a share allocation of the multiplier unit consisting of evenly spaced periods of product formation from inputs and programmed quantities. Use of a "slow" multiplier would introduce preliminary system design problems as a result of the variable time for instruction execution. In this case, to ensure that the slow multiplier is available at fixed word times, a certain amount of programming inflexibility and consequent inefficiency would be forced if the interval of continuous GP mode were very short (the latter can be avoided). To ensure essentially unimpaired GP program efficiency without providing temporary storage of GP instruction and data words at an interrupt time, the input data would be buffered the appropriate few word times required to complete the GP instruction. Thereafter, the input processing multiplications would be called and executed with products entering a buffer to the extrapolation unit. The rather expensive buffer requirements in a system with a slow multiplier are obviated by assigning the system a more expensive multiplier, a fast multiplier such as that in the strap-down computer fabricated during Phase II of the program. The fast multiplier yields real performance improvement and is therefore appropriate in the GP-QDDA system with input processing capability obtained by share of the whole word multiplier unit.

B. QDDA With Input Processing Capability - The extra multi-transfer bit length required for input processing can be implemented, in contrast to the previously discussed design approach, by modal action of the QDDA, without significantly adding to the transfer hardware of the plain QDDA not designed to have the proposed modal action. This result is the general consequence of the mechanization of the QDDA with several multi-bit transfer units. This set of multi-transfer units can be moded to achieve a longer word multiplier by modal logic of modest complexity. No timing problem exists in bringing in pre-processed inputs to the QDPU or distributing outputs since the allocation of each QDPU is programmable. The inclusion of this programmable input processing feature in the QDDA will enable the QDDA to handle any one of the wide set of input processing problems with accuracy consistent with that of existing sensors associated with each application.

The  use  of  the  fast  multiplier  of  a  GP-QDDA  system  will  probably 
not  be  necessary  from  the  standpoint  of  input  processing  word 
length  requirement  until  design  breakthroughs  in  sensor  accuracy 
are  made  some  time  in  the  future. 


4.2 COMMUNICATION AND PROGRAMMABILITY OF WHOLE WORD INPUT
PROCESSING IN THE GP-QDDA SYSTEM - Programmability of GP and QDDA
resolves most problems of implementing programmable input processing.

Those programmability problems which do occur are for the GP-QDDA system
which effects input processing using the whole word multiplier of the GP.

These problems are associated with input pre-processing and extrapolation
operations, which may be implemented in relatively simple parallel processing
loops to achieve a high precision integration algorithm at maximum real iteration
rate of input processing, yet leaving adequate processing time for ordinary
GP operation.

The problem of designing the pre-processing and extrapolation processor loops is complicated by the requirement to handle any of a wide variety of input processor applications involving different numbers of inputs, and different numbers of operations on each input. The input processing routine, which is executed in $2^k$ word times*, is blended together with the residual GP program in such manner that up to one-fourth of all word times are allocated to the input processing. This is accomplished as follows: The normal procedures of GP programming are employed; the input processor routine is programmed at the start, then $(4 \cdot 2^k - N_j)$ words of ordinary GP program, then the input processor routine, and so on until the entire GP program is complete. The pre-processing loop makes available a particular pre-processed input at a given word time modulo $2^k$. Assuming the number of inputs is 4, then input accumulation and partial extrapolation quantities can be updated and stored in six circulating lines of two words on a drum which make available without delay the pre-processed inputs for subsequent processing by the fast multiplier of the GP. During each input processing phase the multiplier outputs are fed directly to the extrapolator unit. In consequence of the relatively long period until updating of the next input processing, which is more than four times the input processing period, the extrapolator unit need not be constrained to have its contents available on short notice. Assuming that the shortest routine encountered in input processing applications is eight word times, then the extrapolation unit may take 32 word times without presenting a timing problem, an amenable relationship for input processing with 32 word routines (strap-down calculations have a 27 word routine). The extrapolator has three 32 word circulating lines which have simple logic for extrapolator processing in the same manner as the strap-down processor. The programming structure of the GP may be retained in this system by adding an input processing state counter which, during input processing periods, signals the transfer of pre-processed inputs to, and extrapolator inputs from, the fast multiplier rather than according to address. The extrapolator unit has outputs which are increments used for updating the final outputs of the input processing operation. The problem of updating and communicating the final input processing outputs may be resolved by mechanizing special parallel updating logic for 32 words of the rapid access memory. The increment quantity outputs from the extrapolator are associated with these 32 words and automatically update these words in fixed order during proper state of the input processing counter. A GP computer with state of the art word rates can be given input processing capability with complete adequacy at 100 to 200 it/sec and yet retain more than 75 percent of computation capacity for ordinary GP operation.

*Here k is the least integer such that $2^k$ is greater than the number of words in the input processing routine.
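
The word-time bookkeeping of this share pattern can be checked with a short calculation. The sketch below is an assumption-laden illustration: the function name is hypothetical, the routine length is taken to be the quantity subtracted from $4 \cdot 2^k$ in the text, and the footnote's "greater than" is read literally as a strict inequality.

```python
def share_schedule(routine_words):
    """Hypothetical helper: word-time budget for one share cycle.

    Returns (k, cycle_words, gp_words, input_fraction), where k is the least
    integer with 2**k > routine_words (footnote read literally), the cycle is
    4 * 2**k word times, and gp_words is the ordinary GP program between two
    successive executions of the input processing routine.
    """
    k = 0
    while 2 ** k <= routine_words:
        k += 1
    cycle_words = 4 * 2 ** k
    gp_words = cycle_words - routine_words
    return k, cycle_words, gp_words, routine_words / cycle_words


# Strap-down case quoted in the text: a 27 word routine.
print(share_schedule(27))
```

For the 27 word strap-down routine this gives k = 5, a cycle of 128 word times, 101 words of ordinary GP program per cycle, and an input processing share of about 21 percent, which is consistent with the stated bound of one-fourth.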

4.3 COMMUNICATION AND STORAGE MODIFICATIONS OF QDDA FOR
PROGRAMMABLE INPUT PROCESSING CAPABILITY - The programmable QDDA with 2M bit transfer for each of the two parallel channels, obtainable by modal switching of pairs of M bit transfer units during the first w word times ($w \le 16$), generates multi-increment outputs of 2M bits representing first differences. According to mechanization of register memory by cores or drum there must be provision for the added communication requirements of 64M bits. Core registers require increased communication wiring, and drum registers require additional rapid access memory bits (both by a count of 64M). For M = 4 the QDDA is capable of all input processing functions including strap-down navigation with accuracy consistent with state of the art sensors. A multi-increment QDDA with 128 QDPU, ordinarily requiring 512 bits of core memory for the drum register case, requires 768 bits for programmable input processing capability in 16 QDPU with eight bit transfer, and with a minimum of 112 QDPU with four bit transfer for the remaining computation task of the mission.
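
The memory figures quoted above are mutually consistent, as the following arithmetic check shows. The helper name is hypothetical, and reading the 512 bit baseline as 128 QDPU at four bits each is an assumption; the text states only the totals and the 64M increment.

```python
def core_bits_with_input_processing(base_bits, M):
    """Core bits after adding the 64M bit provision quoted in the text."""
    return base_bits + 64 * M


# Assumed reading: 128 QDPU at 4 bits each reproduces the quoted 512 bits.
baseline = 128 * 4
total = core_bits_with_input_processing(baseline, M=4)
print(baseline, total)
```

With M = 4 the added provision is 256 bits, which takes the 512 bit baseline to the 768 bits stated for the drum register case.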




CHAPTER  V 


EVALUATION  OF  AUXILIARY  DDA  DESIGN 
AND  COMPUTATION  FEATURES 


5.0 INTRODUCTION - A number of possible design features of DDA which generally affect programming ease, computation versatility and capacity in substantial but limited degree are described and evaluated. As will be seen, the more valuable of these auxiliary features are incorporated in the full scale computer system design which is the major product of this study. The design features evaluated are:

1. Variable (Programmable) Word Length

2. Output and Multi-Input Scaling Programmability (of the
type $2^k$, k integral)

3. Multi-Input Quantization

4. Decision Operation

5. Communication Programmability

6. Derivary Communication*

7. Servo Operation

*See First Phase Technical Documentary Report, pp 15-19.

Also, a number of pertinent computation features are analyzed.

5.1 VARIABLE (PROGRAMMABLE) WORD LENGTH - Different portions of the computation program generally have markedly different accuracy requirements and involve computation variables with markedly different maximum rates. A computer with variable word length can be programmed to handle these computations with a substantial net saving in program bit length, compared to that of a DDA with fixed word length (in which portions of a majority of the registers are not used). Depending on the application the resultant iteration rate of the variable word length DDA may be 30 to 75 percent greater than that of the fixed word length DDA. It has been found in programming a DDA with incomplete communication (one integrator not being capable of picking up outputs of any chosen integrator, but perhaps any one of half the total set) that full exploitation of variable word length for increased iteration rate is not generally possible, and that perhaps half the gain is realized. There are other design conditions which can reduce the full advantage somewhat, such as minimum word length imposed when parallel lines on a drum are used for storage of information. On the whole, mechanization for variable word length and concomitant advantages of input and output scaling are among the cheapest significant gains in computation capability of a sophisticated incremental computer.

5.2 OUTPUT AND MULTI-INPUT SCALING PROGRAMMABILITY - DDA computation accuracy is highest when the maximum output rate of the R register approaches unity. It is therefore of very real value to be able to adjust the general order of magnitude of computation variables by scaling in a simple manner which does not require scaling integrators. Since in a delay line containing a binary number an added k bit delay is equivalent to a $2^k$ scale change, the mechanism for achieving input and output scale changes is relatively simple. Output scaling has, from the standpoint of accuracy of the output, only one best scale; however, since there would be occasions when input scaling cannot be arranged to do the whole $2^k$ scaling of variables, it can have some practical value in cases where the input scaling range is limited. For variable word length DDA the output scaling is free. Since a single output can be required at a number of inputs with different scales it is clear that adequate $2^k$ scaling cannot be done simply with output scaling but also requires input scaling. The degree and kind of input scaling capability can have substantial effect on the ultimate computation capacity obtained through increased programming versatility (up to a sharp limit). However, mechanization complexity could be the negative factor. Programming versatility for most applications essentially reaches the upper limit by providing for about five independent inputs to a register with independent scales. A scaling range of $2^0$ to $2^8$ does not make all programs directly programmable, but with programming ingenuity most problems may be programmed within this degree of flexibility without loss of program efficiency or computation accuracy. The mechanization of multi-input pickup and accumulation which directly uses the delay-scale property is simplest if all inputs have a different scale since, for single increment communication, only one adder is required. The capability of handling two inputs of the same scale (and afterwards inputs of at most two of the same scale) appears, however, necessary as well as sufficient for adequate programming flexibility.
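
The delay-scale property used above can be illustrated with a serial bit stream. The sketch assumes a least-significant-bit-first serial word and an unbounded line length (no truncation at the word boundary); the helper names are hypothetical and are not taken from the report.

```python
def to_serial(value, width):
    """Serialize a non-negative integer, least significant bit first."""
    return [(value >> i) & 1 for i in range(width)]


def from_serial(bits):
    """Rebuild the integer from a least-significant-bit-first stream."""
    return sum(bit << i for i, bit in enumerate(bits))


def delay(bits, k):
    """Model an added k bit delay: every bit arrives k bit times later."""
    return [0] * k + list(bits)


word = to_serial(13, 8)
print(from_serial(delay(word, 3)))
```

Delaying the word 13 by three bit times yields 104, i.e. a scale change of $2^3$, with no adder or scaling integrator involved.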

The  operation  of  input  accumulation  for  y  registers  must  be  carried  out  con¬ 
tinuously  to  the  end  of  the  word  (in  order  to  accomplish  full  updating  with 
possible  non-zero  carry  to  the  most  significant  end  of  the  register).  Thus  a 
DDA  with  more  than  one  y  register  must  have  individual  updating  arithmetic 
units  for  each  y  register.  Two  parallel  DDA  computers  with  equal  computa¬ 
tion  capacity,  but  with  a  different  number  of  y  registers,  involve  different 
costs  in  providing  required  updating  arithmetic  units.  For  a  given  level  of 
computation  capacity,  the  fewer  the  number  of  y  registers  the  better.  The 
second feature of multi-input processing associated with algorithm is discussed in the next section.

5.3 MULTI-INPUT QUANTIZATION FOR SINGLE OR MULTI-TRANSFER - A DDA may be designed to have multi-input programmability of (independent variable) $\Delta X$ registers (used for single or multi-transfer control). The design problem of providing for multi-inputs to $\Delta X$ registers differs from that for y registers in an important way. A difference arises from the general fact that $\Delta X$ registers may be short as a result of few bit multi-transfer (in the case of simplest transfer mechanization the amount shorter may be the transferred y word length). Inputs are necessarily scaled to be consistent with limited transfer capability. Since $\Delta X$ registers are short their full updating could usually be executed in a fraction of a word time, implying that a single updating arithmetic unit can, in principle (by serial sub-word operation), update several $\Delta X$ registers in a computer designed to execute a number of transfer operations in parallel. In this respect updating $\Delta X$ registers may be economical in a DDA with sophisticated processing capability. A second feature of multi-input processing for a register is the requirement for a quantization operation, i.e., roundoff operation. In the case of $\Delta X$ variables the limited multi-transfer capability provided for by the mechanization requires that accumulated inputs when used for transfer control be rounded off. In the case of y variables the phasing of transfer start (depending on mechanization) in effecting integration algorithm may require a roundoff operation in generating transferred variables. Roundoff operations must be selected to remove bias. For ternary or ordinary multi-increment communication the subsignificant register quantity for multi-inputs subjected to roundoff may be initialized (at one-half) in the conventional manner for a ternary R register. The extreme shortness of multi-input registers requires that a further refinement be mechanized since such a register is
biased  to  the  extent  of  parts  of  the  maximum  content  where  i~  is  the 

scale  of  the  lowest  scale  input.  Using  the  sophisticated  roundoff  operation 
(simply)  mechanized  in  the  strap-down  processor  for  input  accumulator  outputs 
to  the  multiplier,  the  bias  is  removed. 
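
The effect of initializing the subsignificant quantity at one-half can be seen in a small numerical experiment. The sketch below is an assumed model, not the report's mechanization: inputs are accumulated in a residue, whole units are transferred, and the function names and the input granularity of one-eighth are chosen only for illustration.

```python
import math


def quantize_stream(inputs, initial_residue):
    """Accumulate inputs and transfer whole units, keeping the residue."""
    residue = initial_residue
    emitted = []
    for x in inputs:
        residue += x
        whole = math.floor(residue)
        emitted.append(whole)
        residue -= whole
    return emitted


def mean_tracking_error(inputs, initial_residue):
    """Average of (true running sum - transferred running sum)."""
    true_sum = 0.0
    sent_sum = 0
    error_sum = 0.0
    for x, w in zip(inputs, quantize_stream(inputs, initial_residue)):
        true_sum += x
        sent_sum += w
        error_sum += true_sum - sent_sum
    return error_sum / len(inputs)


steps = [0.125] * 64
print(mean_tracking_error(steps, 0.0))   # residue started at zero
print(mean_tracking_error(steps, 0.5))   # residue started at one-half
```

Starting the residue at zero leaves a mean error of 0.4375 unit, while starting it at one-half reduces the mean error to -0.0625 unit. The small remaining offset in this toy case is half the input granularity, which appears to be the kind of residual bias the text attributes to very short registers and removes by a further refinement.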

5.4 DECISION OPERATIONS - The essential decision capabilities recognized for DDA design are accepted here as well as elaborations important for extended DDA application. Their implementation in the full scale computer system with a remarkably new processing structure is described in Chapters X and XI.




5.5 COMMUNICATION PROGRAMMABILITY - Program versatility of a DDA is reduced if the level of communication programmability is not sufficiently high (despite use of ingenuity by the experienced programmer). A computer for a full aerospace mission may require more than four or five times the program of a computer for a primarily airborne inertial navigation function. Moreover, the majority of aerospace program subroutines may require a high degree of intercommunication. This would appear to imply that a computer for a full aerospace mission should have a higher degree of communication programmability than the conventional DDA. It will be shown in later chapters that the design approach developed for the full scale aerospace computer with multi-increment computation leads to a reduced number of computer outputs (by a factor of two) which are single increment rather than multi-increment. Consequently the mechanization required to make outputs available for rapid access as inputs is not only simpler than that of a multi-increment DDA but also simpler than that of a single increment DDA. A net advantage in simplicity of total communication structure for full communication of the full scale aerospace computer, relative to the conventional DDA with equal program, is retained after further taking into account that three (or four) input variables rather than two input variables are allocated component input variables, and that the total number which must be provided is comfortably the same (six). Communication structure for partial communication, such as using z lines on a drum, appears somewhat simpler for the conventional DDA but inflicts a program capacity reduction which is significant especially for a variable word length DDA. In the case of a sophisticated incremental computer where outputs are collected in a rapid access memory, and selected by drum stored input selection words, certainly if some cramping in drum storage of input selection word set appeared economical, a certain fraction of the total number of words in the set for the generalized integrator could be shortened without appreciable communication loss. Since adequate storage space (assuming drum memory) is available because of a provision for a set of short registers, the latter need not be resorted to in the proposed aerospace computer.

5.6 DERIVARY COMMUNICATION - A new type of communication (termed derivary) using the ternary set +1, -1, ±0 was proposed during Phase I as a means of communicating second difference information for higher order integration without actually increasing the bits communicated. Since second order algorithms involve the factor one-third it was seen that this factor (unamenable to delay-scale mechanization) could be effectively made on second differences by counting modulo three and communicating in binary + or - when the first difference ternary is zero. The primary value of second order algorithm has been shown, however, to arise in high frequency variable computation with multi-increment accuracy, the latter because roundoff effects are reduced to below second order algorithm magnitude levels. The derivary communication concept can be applied (if required accuracy warrants) to a multi-increment computer of the revolutionary type discussed in later chapters and which uses second difference communication of ternary type.

In this case the utilization of the information space when the increment is zero, provided by + and - selection, may communicate second differences with scale one-third for the integration algorithm.

5.  7  PHILOSOPHY  OF  INTEGRATION  AND  QUOTIENT  ALGORITHM  PROC¬ 
ESSES  IN  RELATION  TO  SERVO  ACTION  IN  GENERAL  DDA  COMPUTATION  - 
An  investigation  was  initiated  to  evaluate  the  degree  of  generality  solely*  of 
incremental  integration  and  quotient  algorithm  processes  alone,  for  computa¬ 
tion  tasks  other  than  slewing  typically  assigned  to  an  incremental  computer,  and 
tasks  envisioned  in  future  applications.  The  conventional  DDA  with  codable 
servo  operation  is  the  excepted  case  in  point,  for  the  general  impression  has 

*Decision  functions  however  being  incorporated  in  non-analytical  processes 
of  switching. 


V-6 


existed  on  the  part  of  many  that  high  precision  is  obtainable  in  some  computa¬ 
tions  using  implicit  computation  with  servo  loops*.  At  this  point,  the  relation¬ 
ship  of  general  servo  mechanism  function  to  direct  computation  function  ia  one 
formative  basis  in  evaluating  the  alternative  designs.  Generally,  the  direct 
computation  function  if  precisely  executed  leads  to  accurate  results,  but  if 
subject  to  inexact  execution  through  integration  algorithm  error  or  roundoff 
error,  it  can  lead  to  inaccurate  non-self>correcting  results.  In  general,  the 
servo  mechanism  function  is  ideal  to  insure  the  reduction  of  large  induced 
errors,  assuming  the  errors  have  some  magnitude  bounds,  but  this  is  accom¬ 
plished  at  the  price  of  tolerating  a  certain  range  or  dead  zone  of  error  inherent 
to  servomechanism  action.  The  largest  computation  execution  errors  in  whole 
word  incremental  computations  are  typically  those  of  integration  algorithms. 
Conventional  DDA  design  is  such  that  roundoff  error  is  equally  as  important 
as  integration  algorithm  error.  With  the  exception  of  applications  basically 
requiring  decision  modes,  simulations  have  shown  that  high  precision  is 
uniformly  attained  in  sophisticated  DDA  systems,  without  servo  elements, 
thus  avoiding  limitations  in  computation  accuracy  implied  by  servo  mechanism 
computation.  A  special  error  source  of  the  servo  despite  multi-input  scaling 
is  the  effective  lag  produced  by  chance  superposition  of  "l"s.  Though  this 
lag  source  is  reduced  by  reducing  scale  to  extremely  small  values  the  result 
is  noisy  servo  action. 

5.8 WHOLE WORD DERIVATIVE COMPUTATION IN A DDA WHICH IS LAG
FREE - A variable x is updated in a DDA with single or few bit increments $\Delta x$
which represent the true derivative of x. An application may require a computer
output of a whole word representation of the derivative $\dot{x}$, in which case
the $\Delta x$ form is inadequate from the standpoint of accuracy.


*Systems of integrators involving at least one operational integrator, i.e.,
servo.




The conventional methods of generating a whole word $\dot{x}$ are based on linear
smoothing methods involving pure integration and, alternatively, on the additional
use of a servo element. The general impression that the former method
must involve error as a result of a lag for band limited inputs is the result
of the use of unsophisticated smoothing computations, and will be shown to be a
false one. Thus in effect it will also be shown that there is no natural limitation
of integration methods in a DDA of this type. The importance of lag
effects in a computer output variable cannot be overstated, as, for example,
in applications to craft control where instability can result. Thus,
in certain critical cases, it is better to tolerate a higher noise level than a lag
in a variable. Through design effecting higher iteration rate and
multi-increment computation, the noise level is greatly reduced. Thus
the development of a lag free derivative computation complements overall
performance improvement for such applications. There are applications in
which the rates of variation of derivative variables to the output are so pronounced
that the freedom from lag implied by system requirements must be such
that a locally quadratic $\dot{x}$ can be transmitted without lag, in contrast to the less
demanding case of a locally linear $\dot{x}$. Consider an attack on this demanding
problem through analysis using continuous linear smoothing theory. Conventional
first order smoothing is given by

$$\bar{\dot{x}}(t) = \int_{-\infty}^{t} K e^{-K(t-\tau)}\,\dot{x}(\tau)\,d\tau \qquad \text{(V-1)}$$

which may be shown for $\dot{x} = (\dot{x})_0 + (\ddot{x})_0\,t$ to introduce a lag of time
$1/K$ in the $\dot{x}$ generated. Computation of unlagged $\dot{x}(t)$ through use of past
history properties at time $\tau$, in the case where no smoothing is required, would
be possible by Taylor series, thus

$$\dot{x}(t) = \dot{x}(\tau) + \ddot{x}(\tau)(t-\tau) + \dddot{x}(\tau)\,\frac{(t-\tau)^2}{2!} \qquad \text{(V-2)}$$




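Where smoothing is obviously required over a wide range of times, this
suggests a general computation form which is unlagged for locally quadratic
$\dot{x}$, namely

$$\dot{x}_c(t) = \int_{-\infty}^{t} \left[\dot{x}(\tau) + \ddot{x}(\tau)(t-\tau) + \dddot{x}(\tau)\,\frac{(t-\tau)^2}{2}\right] g(t-\tau)\,d\tau \qquad \text{(V-3)}$$

where the weighting function g must satisfy

$$\int_{0}^{\infty} g(x)\,dx = 1$$

since substituting for the truncated Taylor series, $\dot{x}(t)$, yields

$$\dot{x}_c(t) = \int_{-\infty}^{t} \dot{x}(t)\,g(t-\tau)\,d\tau = \dot{x}(t)\int_{0}^{\infty} g(x)\,dx \qquad \text{(V-4)}$$
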

and we require $\dot{x}_c(t) = \dot{x}(t)$. The actual information available is $\dot{x}(\tau)$,
hence the linear smoothing form will be expressed in terms of $\dot{x}(\tau)$ by a transformation
obtained by successive integration by parts.

Consider the selection of $g(x) = K e^{-Kx}$ for the weighting function in the second
order lag free computation form.


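It is shown by direct differentiation of the terms in the integrand for this case
that

$$\dot{x}_c(t) = \int_{-\infty}^{t} K e^{-K(t-\tau)}\left[3 - 3K(t-\tau) + \frac{K^2 (t-\tau)^2}{2}\right]\dot{x}(\tau)\,d\tau \qquad \text{(V-6)}$$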
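
The bracketed weighting in (V-6) can be checked with a short worked derivation. This is a sketch under stated assumptions, not a transcription of the report's own intermediate step: the shorthand $u = t - \tau$ is introduced here for convenience, and $\dot{x}(\tau)$ and its derivatives are assumed to remain bounded as $\tau \to -\infty$ so that the boundary terms vanish. Integrating the second term of (V-3) by parts once and the third term twice gives

$$
\begin{aligned}
\int_{-\infty}^{t} \ddot{x}(\tau)\,u\,g(u)\,d\tau &= \int_{-\infty}^{t} \dot{x}(\tau)\,\frac{d}{du}\bigl[u\,g(u)\bigr]\,d\tau, \\
\int_{-\infty}^{t} \dddot{x}(\tau)\,\frac{u^2}{2}\,g(u)\,d\tau &= \int_{-\infty}^{t} \dot{x}(\tau)\,\frac{d^2}{du^2}\Bigl[\frac{u^2}{2}\,g(u)\Bigr]\,d\tau .
\end{aligned}
$$

With $g(u) = K e^{-Ku}$,

$$
\frac{d}{du}\bigl[u\,g\bigr] = K e^{-Ku}\,(1 - Ku), \qquad
\frac{d^2}{du^2}\Bigl[\frac{u^2}{2}\,g\Bigr] = K e^{-Ku}\Bigl(1 - 2Ku + \frac{K^2u^2}{2}\Bigr),
$$

and adding $g(u)$ itself gives the total weighting $K e^{-Ku}\bigl(3 - 3Ku + \tfrac{1}{2}K^2u^2\bigr)$. Its zeroth moment is 1 and its first and second moments are 0, which is the condition for transmitting a locally quadratic $\dot{x}$ without lag.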




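The DDA will effect the computation by solving the differential equations
obtained from the integral form. To obtain the computation as a set of first
order differential equations, each one of which is effected by a single integration,
differentiate the integral form of $\dot{x}_c$ as follows

$$\frac{d}{dt}\left[\dot{x}_c\,e^{Kt}\right] = 3K\dot{x}\,e^{Kt} + K\int_{-\infty}^{t} e^{K\tau}\left[-3K + K^2(t-\tau)\right]\dot{x}(\tau)\,d\tau \qquad \text{(V-7)}$$

to obtain the relation

$$\frac{d}{dt}\left[\left(\ddot{x}_c + K(\dot{x}_c - 3\dot{x})\right)e^{Kt}\right] = -3K^2\,\dot{x}\,e^{Kt} + K^3\int_{-\infty}^{t} e^{K\tau}\,\dot{x}(\tau)\,d\tau \qquad \text{(V-8)}$$

denoting

$$v = \frac{1}{K}\,\ddot{x}_c + \dot{x}_c - 3\dot{x} \qquad \text{(V-9)}$$

and using

$$\bar{\dot{x}} = K\int_{-\infty}^{t} e^{-K(t-\tau)}\,\dot{x}(\tau)\,d\tau \qquad \text{(V-10)}$$

$$\frac{d\bar{\dot{x}}}{dt} = K\left[\dot{x} - \bar{\dot{x}}\right] \qquad \text{(V-11)}$$

$$\frac{dv}{dt} = K\left[\bar{\dot{x}} - 3\dot{x} - v\right] \qquad \text{(V-12)}$$

From the definition of v a third differential equation is obtained. Thus $\dot{x}_c$ can
be computed with three integrators, each solving one of the differential equations

$$\frac{d\bar{\dot{x}}}{dt} = K\left[\dot{x} - \bar{\dot{x}}\right] \qquad \text{(V-13)}$$

$$\frac{dv}{dt} = K\left[\bar{\dot{x}} - 3\dot{x} - v\right] \qquad \text{(V-14)}$$

$$\frac{d\dot{x}_c}{dt} = K\left[v - \dot{x}_c + 3\dot{x}\right] \qquad \text{(V-15)}$$

which in difference equation form are approximated by

$$\Delta(\bar{\dot{x}}\,T)_n = (KT)\left[\Delta x_n - (\bar{\dot{x}}\,T)_{n-1}\right] \qquad \text{(V-16)}$$

$$\Delta(v\,T)_n = (KT)\left[(\bar{\dot{x}}\,T)_{n-1} - 3\,\Delta x_n - (v\,T)_{n-1}\right] \qquad \text{(V-17)}$$

$$\Delta(\dot{x}_c\,T)_n = (KT)\left[(v\,T)_{n-1} - (\dot{x}_c\,T)_{n-1} + 3\,\Delta x_n\right] \qquad \text{(V-18)}$$

which with refined algorithm yield a lag free whole word derivative $\dot{x}_c$. The
price of removing lags in the derivative estimate is some increase in noise.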
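
The three-integrator computation can be illustrated with a small simulation. This is a minimal sketch, assuming the reading of (V-13) to (V-15) in which the first order estimate, the auxiliary variable v, and the lag free estimate are each advanced by one simple (Euler) integration per iteration; the function name, the test input, and the values of K and T are hypothetical choices for illustration only, and the true derivative is supplied as a whole number rather than as a DDA increment stream.

```python
def lag_free_derivative(xdot, K, T, steps):
    """Advance three coupled first order equations, one integration each.

    xbar : first order smoothed estimate (lags the input)
    v    : auxiliary variable
    xc   : lag free estimate
    All three are updated from the previous iteration's values.
    """
    xbar = v = xc = 0.0
    for n in range(1, steps + 1):
        u = xdot(n * T)                      # stand-in for the increment rate
        d_xbar = K * T * (u - xbar)
        d_v = K * T * (xbar - 3.0 * u - v)
        d_xc = K * T * (v - xc + 3.0 * u)
        xbar, v, xc = xbar + d_xbar, v + d_v, xc + d_xc
    return xbar, xc


def quadratic_rate(t):
    """Hypothetical locally quadratic derivative used as the test input."""
    return 1.0 + 2.0 * t + 3.0 * t * t


K, T, steps = 5.0, 2e-4, 25000               # illustrative values only
xbar, xc = lag_free_derivative(quadratic_rate, K, T, steps)
true_value = quadratic_rate(steps * T)
first_order_error = true_value - xbar        # dominated by the 1/K lag
lag_free_error = true_value - xc             # small, set by the step size T
print(first_order_error, lag_free_error)
```

With these illustrative values the first order estimate trails the true derivative by several units, while the lag free estimate differs only by a small amount attributable to the simple integration step.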

It may be shown that the factor of increase relative to the first order computation
is given by

$$\frac{\sigma_{\text{Noise}\ \dot{x}_c}}{\sigma_{\text{Noise}\ \bar{\dot{x}}}} \approx 2$$

Thus the noise error of $\dot{x}_c$ is twice that of $\bar{\dot{x}}$, a surprisingly small increase in
view of removal of the much more critical lag effects.
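
The factor of about two can be checked numerically. The sketch below assumes that the input noise is white, so that the output noise standard deviation of each smoothing computation is proportional to the square root of the integral of its squared weighting function; that noise model, the function names, and the value of K are assumptions made for illustration.

```python
import math


def noise_gain(weight, K, span=40.0, n=200000):
    """Square root of the integral of weight(u)**2 over 0 <= u <= span/K,
    evaluated with the trapezoid rule."""
    upper = span / K
    h = upper / n
    total = 0.5 * (weight(0.0) ** 2 + weight(upper) ** 2)
    for i in range(1, n):
        total += weight(i * h) ** 2
    return math.sqrt(total * h)


K = 5.0  # illustrative smoothing constant


def first_order_weight(u):
    return K * math.exp(-K * u)


def lag_free_weight(u):
    s = K * u
    return K * math.exp(-K * u) * (3.0 - 3.0 * s + 0.5 * s * s)


noise_ratio = noise_gain(lag_free_weight, K) / noise_gain(first_order_weight, K)
print(noise_ratio)
```

Under this assumption the ratio comes out slightly above 2 and does not depend on K, consistent with the statement that the noise error is about twice that of the first order computation.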




CHAPTER  VI 


DEVELOPMENT  OF  SERIAL- PARALLEL  DDA  MECHANIZATIONS  WITH 
HIGH  DUTY  FACTORS  WHICH  ARE  CAPABLE  OF  QUOTIENT  ALGORITHM 
WITH  DERIVED  (SINGLE  INCREMENT)  TERNARY  AND  LATER 
DEVELOPED  MULTI- INCREMENT  COMPUTATION 


6.0 INTRODUCTION - The demanding computation capability requirements
for aerospace applications established in application studies and computer
type computation capability analyses (see Chapter XI) demonstrated that two
types of computation, (a) input processing and (b) quotient operations in
internal computation, present overwhelming computation tasks for conventional
DDA mechanization.* The necessity for and the nature of special design
features for type (a) were established in Phase I of the study. The full significance
of type (b) did not emerge until Phase II, during which aerospace
applications were analyzed in a detailed study and it was established that
there were severe accuracy limitations of conventional DDA mechanizations
in executing quotient type computations which incontrovertibly occur in
aerospace applications. The accuracy limitations were demonstrated in both
analyses and simulations. The historical role of quotient type computations
is discussed more fully in the next section. The conclusion based on analysis
and simulation was that an aerospace computer for a full mission must have
radically different mechanization for even internal computations in consequence
of quotient computation requirements. As will be seen, the ultimate
design developed for a full scale computer system which executes input processing
and internal computations together in a highly efficient mechanization
did not reflect an overall mechanization complexity substantially greater than
that required for input processing. As a result the developments in increased


*  At  conventional  or  relatively  high  iteration  rates. 




internal computation capability made possible by highly unconventional processings
developed in the chapter on multi-increment QDDA are essentially
gains without genuine overall mechanization cost, given that input processing is
established to be necessary. In the analyses of this chapter the balance of
factors of computation capability and mechanization complexity is examined
in detail for the internal computation. The analysis is carried out for the
most part as for a separate computer, leading to an optimized design of such
a subsystem. Since many of the mechanization (rather than computation)
factors leading to the design are compatible with optimized input processing
subsystem design, the development leads to a full scale, fully integrated computer
system actually much more efficient than either of the subsystems
comprising it. Analyses of this chapter have the goal of developing internal
computation mechanizations of high duty factor, i.e., the percent of time of
full use of basic arithmetic capability, as one part of the full scale computer
development.

6.1 THE HISTORICAL AND FUTURE ROLE OF DIVISION ALGORITHM DDA
IN AIRBORNE AND AEROSPACE APPLICATIONS - This report documents*
the many factors implying that a DDA should have division algorithm for a full
aerospace mission. To see the historical role of the algorithm selection rationale,
consider the airborne applications for low speed aircraft. The conventional DDA
for real time computation does not have division algorithm despite the fact
that the design technique** for binary DDA with division algorithm has been
known since 1954. A partial explanation derives from the fact that the
majority of airborne DDA computers are designed for conventional airborne
pure inertial navigation of low speed craft with limited altitude range. The

*See  Computation  Capability  Analysis  (Chapter  X);  and  mechanization  analyses 
applying  to  computer  applications  requiring  parallel  processing  capability. 

**S. Cray, Remington Rand




only occurrence of a computation formally involving division in the pertinent
equations involves division by radial distance from earth center. For limited
altitude range of the craft the division may be avoided by a numerical approximation*
of acceptable accuracy for typical pure inertial navigation applications
of the past. Now that doppler radar has been developed to enable high
accuracy long term navigation for low speed craft, it should be pointed out that
lack of division algorithm in DDA for conventional airborne navigation will be
recognized as unfortunate. In the damping, a division calculation is involved
in transformation of doppler velocities from craft to inertial coordinates;
this involves relatively high frequency variables and in consequence makes
division by integration or servo processes unacceptably inaccurate.** Before
discussing the applications other than low velocity craft navigation which quite
clearly require division algorithm, consider what the hardware and performance
trade off is for conventional airborne navigation with conventional and
division algorithm (where division is not formally required). Division algorithm
and single increment communication imply with drum memory, in the
most elementary case, one additional channel (say from four or five to five
or six channels) for storage of the divisor (in which case digital processing
unit iteration rate equals word rate). Also an additional transfer unit and
a modest amount of overflow logic must be used; communication lines must be
elaborated slightly. While little increase in computer size results in the
drum memory case, the logical complexity is somewhat increased. Consider
now the value (which we point out exists) of division algorithm in a computer
used for the hypothesized application in which division is not required at all.


*Not  good  for  aerospace  missiles 

**See  doppler  damping  program  and  computation  analysis  studies 




One of the more frequent uses* of division algorithm in any application is that
of whole word scaling, which occurs in programming 25 or 30 percent as often
as general integration. Since division algorithm provides parallel generation
of general integration and whole word scaling, an average increase in computation
capacity of 25 to 30 percent results in applications where division is
not formally required. One reason this reasonably priced modest increase
in performance has not been chosen where division is not required is probably
the trend toward ternary communication and computation (with twice binary
precision). No ternary division algorithm was known** until developed in this
contract study. There is also the consideration that a parallel (two integrations
per word time) DDA with or without division algorithm requires only
one additional channel over the elaborated DDA for R register in the drum
memory case to double computation capacity where no division is required.
Taking into account that doppler damping inertial navigation requires division,
the modest increase in complexity in a DDA with ternary division algorithm
is seen to offer a substantial auxiliary increase in computation capacity apart
from the single really essential division operation. It is therefore concluded
that the long term navigator using doppler damping for low speed craft should
in future systems have a DDA with ternary division algorithm (or, as will be
seen, have digital Stieltjes algorithm developed in Chapter IX).

Aerospace  applications  very  definitely  require  division  algorithm  for  inertial 
navigation  at  near  orbital  velocities  because  the  variation  of  radial  distance 
from  earth  center  which  is  prominent  as  the  divisor  in  the  navigation  equations 
is  substantial  and  has  relatively  rapid  change.  A  host  of  other  computations 


*The operation $K\,dx$ is accomplished by division by $K^{-1}$.


**To  our  knowledge 




such as orbital coordinate to earth coordinate computations and re-entry
computations definitely require precision division algorithm.

6.2 HISTORICAL DEVELOPMENTS IN BINARY DIVISION ALGORITHM
DESIGN TECHNIQUES AND THE PURPOSE FOR THE DEVELOPMENT OF
A TERNARY DIVISION ALGORITHM COMPUTER - In a previous incremental
computer design,* a basic processing unit or generalized DDA "integrator" was
devised which could binary increment a reciprocal in one word time in a computer
with binary communication. There is an equivalent of the R register
of the conventional integrator in the modified basic processing unit. One major
difference between this previous unit and the one developed in this study is in
the nature of the R register. Whereas the former employed a double length R
register which had to be stored in rapid access memory at a considerable increase
in machine complexity, the presently conceived device has a single
word R register, as does the conventional integrator, which may be
stored cheaply along with other register quantities on a drum. The binary
computation feature must also be considered a limitation since (though binary
is the cheapest in mechanization) computations in binary are subject to the
"phase" effect associated with the representation of zero with a stream of
alternating +1 and -1 values; also the resolution obtainable in binary computation
is one-half that obtained in ternary computation. For these reasons
the trend in conventional airborne computers without quotient algorithm is
toward ternary design. Because quotient computation is basically more
sensitive to roundoff error than other incremental computations, though less
so in a quotient algorithm machine than a conventional DDA, it is expected


*S. Cray, Remington Rand. Contract No. AF 33 (038-23287).




that a ternary quotient algorithm computer would have much higher accuracy
than the binary quotient algorithm computer. A binary stream of information,
lacking precise representation of zero, does not appear to generalize directly to a
whole word binary number (despite the nomenclature) as indicated by adding
bits to the short word communicated. Thus the development of multi-increment
quotient algorithm for DDA type computers for higher levels of accuracy and
rate handling capability does not appear to be a direct step from the binary
quotient algorithm. A novel development in rate handling capability, but not in
accuracy, was however demonstrated in an incremental computer which utilized
a "variable" increment. Here a several bit word communicated represents a
binary number of magnitude $2^{-K}$ where K, integral, is defined by the word.
Clearly a very wide range of rates can be handled for a sufficiently large range
of K. However, the accuracy of representation of any rate is nevertheless that
of binary single increment. Basic arithmetic accuracy can be obtained only by
pure multi-increment or a combination of multi-increment and variable increment.
Effort in this study was therefore made in the direction of developing ternary
quotient algorithm since the further development of multi-increment quotient
computation appeared to be a generalization which later analyses could fully
exploit.

6.3 DEVELOPMENT OF TERNARY QUOTIENT ALGORITHM

A. General Incremental Computation of Quotient Without Explicit
Division Operations - The arithmetic unit of a DDA is capable of
transfer operations (single or multi-increment multiplications) but
not of direct division. The quotient algorithm must consist of an
incremental computation directly involving only addition, subtraction
and multiplication. While approximate numerical incremental
quotient algorithms have been known since 1954, it is believed that
the first algorithms generally good to second order accuracy are
those derived in the chapters presenting developments in the general
theory of incremental computation. Since the analytical developments
there hypothesize the form (to within algorithm refinements
for high accuracy) of the arithmetic process used in earlier mechanizations,
it may be informative to present a more brief heuristic
(and less general) discussion of a quotient process. The general
features of R register inputs will be delineated. Consider the reciprocal
computation

$$\theta = 1/I \qquad \text{(VI-1)}$$

by incremental processes. Since

$$d\theta = -dI/I^2 \qquad \text{(VI-2)}$$

and division is to be avoided, the alternative form

$$d\theta = -\theta \cdot \theta\, dI \qquad \text{(VI-3)}$$

is one direct incremental relation capable of DDA computation
provided two integrators are used. Double integration incrementation
requires essentially two word times, one for each serial summation,
whether by ordinary DDA integrators or by a single elaborated
unit with two summation registers and a single R register
for output generation. Such a quotient algorithm unit of this direct
type:




1. Requires two word times per incrementation.

2. Involves a single R register rather than two as in an ordinary
DDA, and should obtain some reduction in roundoff error
effects.


Consider now an approach which requires one word time per incrementation.
The alternative difference form

$$0 = I_n\,\Delta\theta_n + \theta_{n-1}\,\Delta I_n \qquad \text{(VI-4)}$$

is exact for $\theta_n = 1/I_n$.
Since the object is to compute $\Delta O_n$*, and unless the equation were
solved for $\Delta\theta_n$, which involves division by $I_n$, a preknowledge of the
answer is needed to satisfy the equation. Before resolving this
apparent difficulty consider the practical method of holding a
quantity to a minimum absolute value over periods of time. The
classical approach of residue retention in this case would apparently
imply computation of an R (register) quantity given by

$$R_n = R_{n-1} + I_n\,\Delta\theta_n + \theta_{n-1}\,\Delta I_n \qquad \text{(VI-5)}$$

The choice of $\Delta\theta_n$ which holds $R_n$ to a minimum absolute value is
the estimate of the reciprocal increment used. Consider the case
of binary representation of $\Delta\theta_n$. Since only two possible values of
$\Delta\theta_n$ are acceptable for computation, the problem of holding $R_n$ to
minimum value under these conditions can be approached by assuming
in two separate calculations of $R_n$ that $\Delta\theta_n = +1$ and
$\Delta\theta_n = -1$. The lesser $R_n$ which results indicates which $\Delta\theta_n$ is the
preferable choice. In an ordinary DDA the R register value is
pertinent, after overflow, in judging the roundoff error. Since


*where $\Delta O_n$ is the nth output from the R register. In this particular
case, $\Delta O_n = \Delta\theta_n$.




nothing happens from the time after overflow to the next iteration
incrementation, the value of R just prior to the next incremental
operation may be considered the pertinent one. In the division
process, if no effort were made to get the R register into the ideal
state during the iteration considered, but rather simply to determine
the best choice of $\Delta\theta_n$, which is made to be the overflow, then
the adjustment to get the R register into the proper state can be
made just prior to incrementation without palpable effect on the
results computed. Note that for $R_{n-1}$ (corrected as explained
above) the R register would contain (introducing a pseudo variable
$R^*_n$):

$$R^*_n = R_{n-1} + \theta_{n-1}\,\Delta I_n \qquad \text{(VI-6)}$$

provided $I_n\,\Delta\theta_n$* were not added. If $R^*_n > 0$ and $I_n > 0$ then,
certainly, $\Delta\theta_n = -1$ would make $R_n$ smaller in absolute value than
$\Delta\theta_n = +1$. In general $\Delta\theta_n = -\operatorname{sg} R^*_n \operatorname{sg} I_n$ is the proper choice
and may be formed with simple logic using $R^*_n$ and $I_n$. Having the
correct $\Delta\theta_n$, the R register can be corrected for the term $I_n\,\Delta\theta_n$
just prior to the next iteration to obtain (with relabeling so as to
consider the next iteration as the nth)

$$R_{n-1} = R^*_{n-1} + I_{n-1}\,\Delta\theta_{n-1} \qquad \text{(VI-7)}$$

The final arithmetic operation (actually carried out at the same
time) is the addition of $\theta_{n-1}\,\Delta I_n$; hence the actual (rather than
minimal) residual is

$$R^*_n = R^*_{n-1} + \theta_{n-1}\,\Delta I_n + I_{n-1}\,\Delta\theta_{n-1} \qquad \text{(VI-8)}$$

*Where $I_n$ is the algorithm form of the divisor term in the formal residue
incrementation formula.




where $\Delta\theta_n$ selection as $\Delta\theta_n = -\operatorname{sg} R^*_n \cdot \operatorname{sg} I_n$ insures that the
effective residue of the residue retention method is minimal
despite the fact that a pseudo residue actually appears in the R
register. The actual incremental computation of the R register
value stated above does not involve the output $\Delta\theta_n$, but only the
previous output, resolving the "horse before the cart" implications
described earlier in this description. Economical mechanizations
without rapid access memory for R registers are obtainable because
of this fact. The basis for precise numerical incrementation
formulae for more general computations involving division is
developed in the chapter on general numerical incremental computation.
When variables in the computation process are fed back,
the formal relations stated imply the knowledge of outputs beforehand
in the same way as in the above analysis. The formulae must
be converted to practicable form in the same manner as in the example.
The formulae derived there formally apply to whole numbers
but may be applied to any given approximate number representation
with the usual type of roundoff error introductions, which in principle
can be reduced somewhat in effect by special processing alterations.
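
The binary reciprocal process described above can be sketched in a few lines. This is an illustrative sketch rather than the mechanization itself: it assumes the pseudo residue recursion as read from (VI-8), a register resolution q that is a power of two, a divisor starting at exactly 1, and a sign function with sg(0) taken as +1; all names are hypothetical. The point shown is that each pseudo residue update uses only the previous output, with the divisor term for the current output deferred to the next iteration.

```python
def sg(x):
    """Sign function; sg(0) is taken as +1 (an assumption)."""
    return 1.0 if x >= 0.0 else -1.0


def binary_reciprocal(delta_I_seq, q):
    """Track theta = 1/I with binary (+q or -q) increments of theta.

    r_star  : pseudo residue held in the R register
    pending : divisor term I * d_theta of the previous output, added late
    """
    I, theta = 1.0, 1.0          # start from an exact reciprocal pair
    r_star, pending = 0.0, 0.0
    worst = 0.0
    for dI in delta_I_seq:
        # only the previous output enters this update
        r_star += theta * dI + pending
        I += dI
        d_theta = -q * sg(r_star) * sg(I)   # overflow decision
        theta += d_theta
        pending = I * d_theta               # deferred to the next iteration
        worst = max(worst, abs(theta - 1.0 / I))
    return theta, worst


q = 2.0 ** -10
sweep = [q] * 1024 + [-q] * 512             # divisor runs 1 -> 2 -> 1.5
theta_final, worst_error = binary_reciprocal(sweep, q)
print(theta_final, worst_error)
```

For this sweep the tracked value stays within one increment q of the true reciprocal at every iteration, even though no division is performed inside the loop (the division in the code is used only to measure the error).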

6.4 TERNARY QUOTIENT ALGORITHM - The principal difference between binary
and ternary quotient algorithm lies in the overflow criterion. The preceding
description shows that the essence of the overflow criterion is the selection of
that one output of the set of allowed outputs for the communication type which
minimizes the formal residue (rather than the actual value in the R register)
associated with the principle of residue retention. Ternary communication
allows three outputs: +1, -1, 0. The value zero is the best choice when the


*Where $I_n$ is the algorithm form of the divisor term in the formal residue
incrementation formula.




absolute value of the R register is less than $\tfrac{1}{2}\,|I_n\,\Delta O| = \tfrac{1}{2}\,|I_n|$, since* in this
case an output of absolute value unity would leave the formal residue greater
in absolute value than before. Thus the ternary overflow criterion may be
stated analytically in somewhat more general form as

$$\Delta O_n = -\operatorname{sg} R^*_n \,\operatorname{sg} I_n \; M\!\left(|R^*_n| \ge K\,|I_n|\right) \qquad \text{(VI-9)}$$

where $K = 2^k$, k integral, and M(f) = 1 for f true, M(f) = 0 for f false.

The special device, the quotient differential processing unit (QDPU) with
ternary output, introduces a decision process markedly different from that
of natural overflow of an R register, yet one capable of realization. Several
test calculations were performed by hand for short runs. In each case the
result was correct at each iteration to within the resolution of the registers
(1/2 bit), and the R register equivalent had RMS deviation from null consistent
with that estimated using the roundoff theory of a normal R register.
From the analytical standpoint there is no process in the unit's action which
appears to imply more roundoff error than in normal R register action (on
the contrary, in the sense stated below).
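
A short machine run in the spirit of the hand test calculations can be sketched as follows. It is an illustration under stated assumptions: the overflow criterion is taken in the form "output zero when the pseudo residue is smaller in magnitude than K times the divisor magnitude", with K = 1/2 for ternary and K = 0 for binary; the residue is corrected immediately rather than just before the next iteration; the resolution q, the divisor sweep, and all names are hypothetical.

```python
def sg(x):
    """Sign function; sg(0) is taken as +1 (an assumption)."""
    return 1.0 if x >= 0.0 else -1.0


def reciprocal_run(delta_I_seq, q, K):
    """Incremental reciprocal with overflow parameter K.

    Returns the largest deviation of theta from 1/I seen during the run.
    """
    I, theta, R = 1.0, 1.0, 0.0
    worst = 0.0
    for dI in delta_I_seq:
        r_star = R + theta * dI                 # residue before the output
        I += dI
        M = 1.0 if abs(r_star) >= K * abs(I) * q else 0.0
        out = -sg(r_star) * sg(I) * M           # one of +1, -1, 0
        theta += q * out
        R = r_star + I * q * out                # formal residue after output
        worst = max(worst, abs(theta - 1.0 / I))
    return worst


q = 2.0 ** -10
sweep = [q] * 1024 + [-q] * 512                 # divisor runs 1 -> 2 -> 1.5
ternary_worst = reciprocal_run(sweep, q, K=0.5)
binary_worst = reciprocal_run(sweep, q, K=0.0)
print(ternary_worst / q, binary_worst / q)
```

In this run the ternary result never departs from the true reciprocal by more than half an increment, while the binary result is held only to within one increment.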

The problem of evaluating the implications of the quotient differential processing
unit (QDPU) type design in aerospace computers has been investigated
in several stages with resulting generalization of the QDPU. The
ultimate value of a unit capable of incrementing a wider variety of functions
or functionals lies in the benefits of:

A.  Alternative  iacnsssf  itenathssv  iifoiaf/sr  computer  eesopatefoon 


*Where $I_n$ is the algorithm form of the divisor term in the formal residue
incrementation formula.




B.  Roundoff  error  reduction  through  reduction  of  the  number  of  R 
register  error  sources  involved  in  any  sub- routine  (as  well  as 
the  increased  resolution  and  decreased  algorithm  error  implied 
by  (A)  for  any  given  application). 

The generalized QDPU overflow rule has several interesting properties which
give insight into the nature of roundoff error associated with the R register
for binary and ternary action. Regarding the contents of the R register (which
is the uncommunicated residue of desired output) as an error, the conventional
roundoff analysis implies that overall performance in time for a broad ensemble
of computations is measured by the range of the possible R register values.
The generalized QDPU overflow rule has an overflow parameter which, with
different selections, produces overflow of disparate types including binary,
ternary, blends thereof and an odd variation. From an arithmetic operation
standpoint, the overflow of the R register from the conceptual pre-overflow state is
chosen to have the sign of the arithmetic value represented (except for a division
action where the sign is reversed by a negative divisor), the difference
between binary and ternary action being that ternary calls for a zero output
where doing so reduces the resultant error (relative to not doing so). From
the standpoint of the generalized overflow rule, the overflow parameter determines
the specific situation for zero output (for binary, never), and in
general, for any overflow type, zero is output when the pre-overflow R register
arithmetic value is less in magnitude than the overflow parameter. It may be
shown that the parameter value for pure ternary implies the most narrow
range of R register values; thus ternary is optimum for a statistical overflow
process with single increment output. Let it be granted that the test ensemble
involves the occurrence of all intermediate values between the natural extremes,
since such can be the case for a sufficiently small Y register quantity.
For a hypothesized R register value slightly less in absolute value than the




overflow parameter (for which no overflow occurs), the extreme error evidently
is at least as large as K (the case where output is never zero, binary
where K = 0, being trivially included). In a host of y variable situations, including
the case where y is very small, the absolute value after overflow of
$\pm 1$ is $(1 - K^+)$ for $K^+ < 1$. The overall maximum error for all K satisfies

$$\epsilon = G(K^-,\, 1 - K^+) \approx G(K,\, 1 - K) \qquad \text{(VI-10)}$$

where G(x, y) is the greater of the x and y values, since overflow during computation
following the start of the R register at an intermediate value prevents
leaving the range, assuming |y| < 1. The value of K which minimizes $\epsilon$ is
readily seen to be K = 1/2, for which $\epsilon$ = 1/2. This is the ternary case
(variation about the initial setting of the R register at 1/2 being 1/2 since R
extremes are 0 and 1). The binary case is equivalently that for K = 0, for
which $\epsilon$ = 1, showing that binary has twice the residue range of ternary, the
latter being optimum for single increment overflow. The QDPU, as previously
reported, has the ternary action. In the division mode, the roundoff error
of the output is effectively increased relative to the actual register range
in proportion to the reciprocal of the scaled divisor register quantity.
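
The minimization over the overflow parameter is simple enough to tabulate. The sketch below evaluates the extreme residue G(K, 1 - K) on a grid of candidate K values; the grid and the function name are illustrative assumptions.

```python
def extreme_residue(K):
    """Greater of K and 1 - K, the extreme residue for overflow parameter K."""
    return max(K, 1.0 - K)


candidates = [i / 64 for i in range(65)]        # K from 0 to 1 in steps of 1/64
best_K = min(candidates, key=extreme_residue)
ternary_extreme = extreme_residue(best_K)       # pure ternary case
binary_extreme = extreme_residue(0.0)           # binary case, K = 0
print(best_K, ternary_extreme, binary_extreme)
```

The smallest extreme occurs at K = 1/2, and the binary extreme at K = 0 is twice as large.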

6.5 MECHANIZATION FACTORS IMPLYING PARALLEL COMPUTATION
CAPABILITY IN A RECIPROCAL OR QUOTIENT ALGORITHM COMPUTER -
The mechanization relations of computers with parallel computation capability
and quotient computation capability are delineated without loss of implication
by considering a computer which has the most elementary capability for a
division process in one word time, which may be called a reciprocal algorithm
computer.

Figure 6-1 shows the minimal register array enabling single increment reciprocal
computation in one word time.




Figure 6-1. Register Array for Division in One Word Time

Two transfer operations are required for reciprocal incrementation. The
cost of reciprocal algorithm is somewhat less than that of a conventional
parallel DDA with two outputs, as indicated by the fact that there is one instead
of two R registers. Consider performance of the elementary reciprocal
algorithm DDA compared to the parallel DDA for several different types of
computations found in important aerospace applications. Typically the
number of divisions called for is small (but their accuracy has overall importance).
The number of scaling operations is considerable, say 20 to 30
percent of the total program set. The quotient algorithm computer can
carry out a whole word scaled integration in one word time, which, as in the
case of reciprocal calculation, is a speed performance equal to that of the
parallel DDA; of course the reciprocal algorithm enables greatly improved
accuracy in reciprocal calculation. However, this effective speed performance
is obtained during only about 35 percent of the program since 65 percent
of some programs, in applications such as navigation, typically consist of


VI*  14 


isolated  integrations  for  which  the  reciprocal  algorithm  computer  is  no 
faster  than  one  ODA  integrator.  During  these  computations  the  parallel  DDA 
costing  little  more  operates  twice  as  efficiently.  On  the  basis  of  these 
considerations  a  serial  reciprocal  or  quotient  algorithm  computer  is  relatively 
inefficient  in  certain  applications  such  as  navigation.  To  obtain  an  efficient 
mechanization  for  a  computer  capable  of  precision  reciprocal  calculation  the 
computer  should  also  be  capable  of  parallel  (2  output)  computation.  Such  a 
computer  is  obtained  by  making  transfer  operations  programmable.  Such  a 
computer  is  capable  of  45  to  50  percent  greater  speed  for  applications  in¬ 
volving  navigation. 
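
The 45 to 50 percent figure can be reproduced with a small model. The sketch
below is an illustration under stated assumptions, not the report's own
calculation: it assumes every operation costs one word time on the serial
unit, and that a two-output unit pairs isolated integrations perfectly while
matching the serial unit on all other operations. The function name and
parameters are hypothetical.

```python
def relative_speed(isolated_fraction: float, outputs: int = 2) -> float:
    """Speed of a parallel-capable unit relative to a serial reciprocal unit.

    isolated_fraction: share of the program made up of isolated integrations
                       (about 0.65 for navigation, per the text).
    outputs:           integrations the parallel unit packs into one word time
                       (assumed to be 2).
    """
    if not 0.0 <= isolated_fraction <= 1.0:
        raise ValueError("isolated_fraction must lie in [0, 1]")
    serial_time = 1.0
    parallel_time = (1.0 - isolated_fraction) + isolated_fraction / outputs
    return serial_time / parallel_time


if __name__ == "__main__":
    for fraction in (0.62, 0.65, 0.67):
        gain = 100.0 * (relative_speed(fraction) - 1.0)
        print(f"isolated fraction {fraction:.2f}: {gain:.1f} percent faster")
```

With 65 percent isolated integrations the model gives a gain inside the quoted
45 to 50 percent band; the band edges correspond to isolated fractions of
roughly 0.62 and 0.67 in this simplified model.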

6.6 INCREMENTAL CALCULATIONS ENABLED BY A THREE-TRANSFERS-
TO-R-REGISTER MECHANIZATION - A basic processing unit with two transfers
per iteration can be designed to do at least the work of two DDA integrators
in all cases provided it is capable of two outputs. A single output unit capable
of reciprocal calculation or a whole word scaled integration is found, as a
result of the computation routine structure in applications such as navigation,
to be necessarily allocated a large portion of simple integrations for which
the DDA integrator equivalent is only one. The two output device involves some
extra mechanization costs associated with another R register, the requirement
of at least three inputs, and serial-parallel programmable mode capability.
Units with three or four transfers per iteration and higher integrator
equivalents have these requirements and not substantially more, apart from
increased arithmetic complexity. Quotient incrementation of Q = U/V based on the
relation

$dQ = (dU - Q\,dV) / V$    (VI-11)

involves design for three transfers per iteration. Consider the more general
computation

$dQ = (p\,dA + q\,dB) / v$    (VI-12)

where p, q, v are whole word variables of the form
$(a_0 + a_1 A + a_2 B + a_3 p + a_4 q + a_5 v)$ and the $a_n$ ($n \ge 1$) are
expressible in the form $2^{-K_n}$. An ordinary DDA requires more than 4 word
times to execute (VI-12). The form of (VI-12) includes the most prevalent
semi-complicated computation forms making up such computation applications as
inertial navigation, atmospheric re-entry, and missile guidance. Vector
computations involving scaling or a common divisor often involve this form. A
two transfer mechanization requires two QDPU to perform the calculation. The
three or four transfer mechanization has versatility in function generation
which is very impressive for the one QDPU case. The first class of computations
requiring only one QDPU is of the form

[Equation (VI-13) is illegible in the source.]

where x, y are whole word quantities and the coefficients are of the form
$2^{-K_n}$. Thus, for various choices of constants the QDPU can generate
$x/y$ and $(x^{-1} + y^{-1})^{-1}$.* A second class of computation is


x  +  a*  y  +  Si  xy  +  0,  x*  +  9*  ya 


(VI-14) 


For various choices of constants the QDPU can generate $\sqrt{x}$, $xy$,
$\sqrt{x^2 + y^2}$, the last being useful for polar coordinate transformation.
A third class of computation is the solution of a quadratic (the root being
determined by initial conditions),

0  «  0*  +  I  («o  +  0j  x  +  8*y)  +  [ax  +  a*x  +  a*y  (VI- 15) 

+  0#  xy  +  0*x*  ♦  0*  y*  ] 

*The latter function, used heavily in statistics for variance updating, has
potential applications in adaptive control.


Adaptive missile control system studies of missile atmosphere re-entry
problems relating to skin temperature involve quadratic solution computation.
A fourth class of computation is $x^{-K}$, where $K = 2^{\pm n}$. Related to
this capability is that of generating a cubic function with whole word
coefficients (output scale factor, not whole word), such as the aerodynamic
fit functions used in craft control in re-entry. The inverse capability of
solving a cubic is possible for a certain class with one variable coefficient.
Any of these operations can be updated in one word time.
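
The quotient incrementation discussed in this section can be illustrated in
software. The sketch below is a floating point illustration only, not a model
of the QDPU hardware: it assumes the general increment form
dQ = (p dA + q dB) / v with p = 1, A = U, q = -Q, B = V and v = V, which
reduces to the differential of Q = U/V. All function and variable names are
hypothetical.

```python
from typing import Tuple


def quotient_increment(p: float, d_a: float, q: float, d_b: float,
                       v: float) -> float:
    """One increment of the general form dQ = (p dA + q dB) / v."""
    return (p * d_a + q * d_b) / v


def track_quotient(u0: float, v0: float, d_u: float, d_v: float,
                   steps: int) -> Tuple[float, float]:
    """Update Q = U/V incrementally and return (incremental Q, direct U/V)."""
    u, v = u0, v0
    q = u0 / v0
    for _ in range(steps):
        # dQ = (dU - Q dV) / V, using the divisor value before its update.
        q += quotient_increment(1.0, d_u, -q, d_v, v)
        u += d_u
        v += d_v
    return q, u / v


if __name__ == "__main__":
    q_inc, q_direct = track_quotient(1.0, 2.0, 1e-4, 5e-5, 10_000)
    print("incremental:", q_inc, "direct:", q_direct)
```

Because the divisor is taken before its own update, the incremental value
carries a small first-order error that shrinks with the increment size; the
roundoff and residue behavior of an actual register mechanization is not
modeled here.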

6.7 FURTHER DIGITAL PROCESSING UNIT FUNCTIONAL STUDIES
BASED ON PROPERTIES OF ELEMENTARY COMPUTATIONS PREVALENT
IN AIRBORNE AND AEROSPACE APPLICATIONS - Preliminary investigations
of QDPU* capabilities hypothesizing alternative closely related designs
were carried out in order to deduce heuristically a notion of the natural or
basic computation unit implied as a result of application computation
structures (as deduced from application surveys) and relative complexities of a
range of mechanization structures.

A possible definition for a basic operation to be executed by the QDPU is
any operation occurring within the broad set of computation applications in
sufficient abundance that a real total saving in computation time and gain in
accuracy would result if that operation could be consistently executed in one
word time (or effectively one-half word time) by hardware of acceptable cost
capable of likewise doing so for all other basic operations. The key to
efficient design is a generalized basic processing unit capable of executing


*Quotient Differential Processing Unit.


relatively complicated operations in one word time and able to execute the
major portion of its operations, which are the simplest operations
(principally integration or, next in simplicity, whole word scaled
integration), at more than conventional word time rates. A unit of more than
minimal complexity and more than minimal computation capability must have
parallel independent integrations capability to be efficient in terms of duty
factor. Accepting parallel mode essentiality in the QDPU, a further and very
important basic operation of parallel processing type was shown to exist, as
determined in a survey of aerospace application computations. One of the
most frequent computations in aerospace programs is vector resolution and
rotation in two and three dimensions. A basic operation associated with
these calculations is the two dimensional resolution: given a variable A,
compute the projections of A on the x and y axes, given the direction cosines
$C_x$, $C_y$, namely,


$A_x = A\,C_x$
$A_y = A\,C_y$    (VI-16)


A serial DDA requires four integrators to update $A_x$, $A_y$. Note that only
three whole word variables are involved, just as in the other QDPU basic
modes previously reported. The QDPU capable of executing the incrementations
in parallel

$dA_x = C_x\,dA + A\,dC_x$
$dA_y = C_y\,dA + A\,dC_y$    (VI-17)


*Here operating rate is alluded to. Actually, precision level increases
resulting from reduced roundoff error in themselves justify design of a unit
with a more complex basic operation associated with only one R register
(roundoff error source).

**See Chapter III


in one word time is not significantly more complex than the less versatile
QDPU with the same input capability and register count but only three
transfer units. To see the application of the generalized QDPU in three
dimensional problems, consider first the polar to rectangular coordinate
calculation

$x = r \cos\theta \cos\chi$
$y = r \cos\theta \sin\chi$    (VI-18)
$z = r \sin\theta$

given $\sin\theta$, $\cos\theta$, $\sin\chi$, $\cos\chi$ and r. In this
example, the generalized QDPU updates x, y, z with 100 percent efficiency as
indicated in Figure 6-2.


Figure  6-2.  QDPU  Program  Diagram  for  Updating  x, 
y,  and  z  with  100  Percent  Efficiency 
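
The two dimensional resolution increments of (VI-17) can be sketched in
software. This is a floating point illustration under assumptions made here:
the direction cosines are taken as the cosine and sine of a single angle, the
increments are small and constant, and all names are hypothetical. It does not
model registers, transfers, or roundoff.

```python
import math
from typing import Tuple

Vec2 = Tuple[float, float]


def product_increment(a: float, c: float, d_a: float, d_c: float) -> float:
    """First-order increment of the product A*C: C dA + A dC."""
    return c * d_a + a * d_c


def track_projections(a0: float, angle0: float, d_a: float, d_angle: float,
                      steps: int) -> Tuple[Vec2, Vec2]:
    """Update A_x and A_y incrementally; return (incremental, direct) pairs."""
    a, angle = a0, angle0
    ax, ay = a * math.cos(angle), a * math.sin(angle)
    for _ in range(steps):
        cx, cy = math.cos(angle), math.sin(angle)
        d_cx = math.cos(angle + d_angle) - cx
        d_cy = math.sin(angle + d_angle) - cy
        # Both projections are advanced in the same step, as in the
        # parallel incrementation described in the text.
        ax += product_increment(a, cx, d_a, d_cx)
        ay += product_increment(a, cy, d_a, d_cy)
        a += d_a
        angle += d_angle
    return (ax, ay), (a * math.cos(angle), a * math.sin(angle))


if __name__ == "__main__":
    incremental, direct = track_projections(1.0, 0.0, 1e-4, 1e-4, 10_000)
    print("incremental:", incremental)
    print("direct:     ", direct)
```

The difference between the two results comes from the neglected second-order
term dA dC in each step and shrinks as the increments are made smaller.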


This mechanization accomplishes in two word times the computations for
which previous mechanizations required from 4 to 8 word times. An
example of the proposed QDDA speed performance, relative to other existing
DDA computers, is a typical program for a toss bombing problem chosen to
illustrate the capability of an earlier machine. The example is especially
favorable to the earlier machine because few isolated integrations are called
for, unlike the more important case of navigation computations, for which the
QDDA nevertheless maintains the same efficiency (see Chapter X). The QDDA
program is diagramed in Figure 6-3. Two more conventional serial DDA
performances are summarized for the following computation subroutine:


A  PORTION  OF  THE  PROGRAM  DIAGRAM 
FOR  THE  TOSS  BOMBING  PROBLEM 


|  g  T*  +  H  cos  (A  -  £ )  -  ®r  (sin  (A  -  £  ) 


D  cos  A  +  fj  g  t*  +  H )  sin  (A  -  £  ) 
V  v  u»r  cos  (A  -  J  ) 


(VI-19)


INPUTS:  V,  »r,  £.  D,  H 

CALCULATE:  sin  A 


[Figure 6-3. QDDA program diagram for the toss bombing problem]


SPEED PERFORMANCE

Computer                                          Words/iteration    Speed factor
QDDA (Inertial QDDA Operation Structure)          7 (diagram)        200%
Variable Increment DDA with Quotient Algorithm    14                 100%
Conventional Serial DDA                           29 (estimated)     48%

(This does not reflect equivalent speed increase due to the unique QDDA
higher order ternary algorithm, but only hardware iteration rate.)
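
The speed factors in the table follow from the words per iteration. The short
sketch below assumes, as the table layout suggests, that the variable
increment DDA at 14 words per iteration is the 100 percent reference; the
dictionary and function names are hypothetical.

```python
WORDS_PER_ITERATION = {
    "QDDA": 7,
    "Variable increment DDA with quotient algorithm": 14,
    "Conventional serial DDA (estimated)": 29,
}


def speed_factor_percent(words: int, reference_words: int = 14) -> float:
    """Hardware iteration rate relative to the reference machine, in percent.

    Fewer words per iteration means a proportionally higher iteration rate.
    """
    return 100.0 * reference_words / words


if __name__ == "__main__":
    for name, words in WORDS_PER_ITERATION.items():
        print(f"{name}: {speed_factor_percent(words):.0f}%")
```

Rounded to whole percentages, this reproduces the 200, 100 and 48 percent
entries of the table.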

Relative accuracy performance is affected by the iteration rate as well as
the integration algorithm. The later developed QDDA multi-increment
computation, and also the higher order integration algorithm of the QDPU
described in the chapter on QDDA algorithm and round-off operation, raise the
equivalent iteration rate to an additional level higher than indicated above
for existing incremental computers. In terms of hardware iteration rate alone,
it is noted in the example program that the QDDA with stated operation features
presents almost the same step over the variable increment DDA that the latter
computer presents over the conventional serial DDA. In navigation computation
the step forward is yet greater. It is important to evaluate the relation of
any one example program to the larger set of application programs, especially
for the aerospace case explored in considerable detail in Chapter XI. As
previously reported, certain important applications, most notably inertial
navigation, imply somewhat different relative performances of the various
computer types as a result of more frequent occurrence in their routines of
certain basic operations. In the extreme case (no pure example of which is
believed likely to be found*) where pure non-additive integrations in the
routine

*Nor at all in the QDDA as seen later, since input processing may use a
double precision mode with 100% duty factor.


predominate over all others, the variable increment DDA, because of a lack of
parallel computation ability, would perform only 20 percent faster (rather than
108 percent faster, as in the previous example) than a serial DDA, the 20
percent gain being primarily ascribed to its whole-word scaling capability. In
this extreme (conceptual) case the QDDA operation as stated in this first
analysis would perform 140 percent faster (in later analysis of multi-increment
QDDA the performance is 180 percent faster* for this extreme conceptual case)
than a serial DDA (instead of 316 percent faster as in the previous example),
again considering iteration rate alone. At the other end of the spectrum of
application computations, in which coordinate, vector, and tensor
transformation routines predominate, the QDDA, because of the resolver mode,
has four times the speed of a serial DDA (and still about twice the speed of
the variable increment DDA). On these bases, the toss bombing program described
above is considered to give a fairly accurate (or slightly conservative)
indication of QDDA pure program speed performance on the basis of stated
operation features relative to previously designed variable increment DDA
performance.

It is also important to gauge performances relative to serial DDA, especially
for the inertial navigation application. An airborne navigation system
developed at Litton Industries and designed for minimum overall complexity
was chosen for the preliminary evaluation (the program length is probably
only 25 percent of the length required for a full aerospace mission). A
QDDA program was derived for minimal complexity inertial navigation
which involves only 16 QDPU and (with the conservative 200 kc clock rate)
could achieve an iteration rate as high as 625 iter/sec. A serial DDA
requires 54 integrators and at best, for this clock rate, would have an
iteration rate of 200 iter/sec. Had the DDA been allocated such calculations as
arc sin, etc., instead of the GP, the comparison would be much improved

*But  with  programmable  double  precision. 


for QDDA. The general results of these preliminary studies were considered
sufficiently encouraging to make more detailed studies in the direction of
developing a computer for a full aerospace mission.
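
The quoted iteration rates can be related to the clock rate with a simple
model. The sketch below assumes that units are processed one per word time and
that the word time is 20 bit times; the 20-bit figure is an inference from the
625 iter/sec value and is not stated in the text. The function name is
hypothetical.

```python
def iteration_rate(clock_hz: float, units: int, word_bits: int) -> float:
    """Iterations per second when `units` are processed serially, one per
    word time, each word time lasting `word_bits` clock periods."""
    if units <= 0 or word_bits <= 0:
        raise ValueError("units and word_bits must be positive")
    return clock_hz / (units * word_bits)


if __name__ == "__main__":
    clock = 200_000.0     # 200 kc clock rate from the text
    assumed_word = 20     # assumed word length in bit times (an inference)
    print("QDDA, 16 QDPU:", iteration_rate(clock, 16, assumed_word))
    print("Serial DDA, 54 integrators:", iteration_rate(clock, 54, assumed_word))
```

Under the assumed word length the 16 QDPU program comes out at the quoted
625 iter/sec, and the 54 integrator serial DDA comes out somewhat below the
200 iter/sec that the text gives as its best case.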

6.8 FURTHER STUDIES OF PROGRAMMABLE TRANSFER FEATURES
AND REGISTER REQUIREMENTS FOR A BASIC DIFFERENTIAL PROCESSING
UNIT CAPABLE OF DIVISION WHICH HAS MAXIMUM DUTY
FACTOR - Four major factors imply the DDA design approach of developing
a differential processing unit consisting of a combined integrator ensemble with
serial-parallel processing features:

1.  Precision division computation is required for aerospace applications
(for missiles of near orbital velocity and wide altitude range), implying
several transfer actions within the basic unit cycle.

2.  Parallel transfer capability without parallel output capabilities, the
latter costing little more, limits processing rate for applications such
as navigation.

3.  Demand for a high computation capability, as a result of the large
aerospace computation programs and high precision requirements.

4.  The full scale computer system with simplest mechanization that is
capable of input processing and internal computation functions has a
single arithmetic module which is time shared for these functions.
The minimum level of complexity for the input processing implies
overall low cost mechanization of the combined integrator ensemble with
serial-parallel processing features of complexity up to that required
for input processing.

One useful measure of the efficiency of a differential processing unit is the
duty factor, i.e., percent of full time employment, during execution of a
computation program, of available arithmetic capability toward a useful
end. It will be assumed that all mechanization features which do not limit
processing versatility or duty factor are optimized. The appropriateness
of a differential processing unit with a high duty factor, in a DDA system
tailored to the application, is the degree of match between the computation
capability available and that demanded by the application, since realizing a
high duty factor as the assumed design goal implies minimum mechanization
cost for the computation capacity obtained. It was decided that quotient
algorithm is required for the aerospace application and, associated with
these requirements, parallel computation design for high duty factor in the
broad set of computations. A quotient differential processing unit, QDPU,
should be capable of two outputs and probably not more. Restricting
consideration to two output devices, the problem of attaining high duty factor
with the simplest mechanization involves investigation of such features as
the number of input variables (each input variable being the sum of several
outputs of a QDPU or external input) involved in transfer operations and, in
addition, the number of y registers. The number of transfer units determines
an upper limit on computation capacity which must match the application.
Studies in single increment QDPU's consider two to four transfer designs,
two being the minimum for precision reciprocal computation. The number
of input variables involved in transfer operations necessary to enable high
duty factor depends on the nature of program computations. Airborne and
aerospace calculations typically involve vector and coordinate
transformations, which are composed of elementary operations, pairs or triples
of which have a common variable. It will be shown that provision for three
input variables, each of which may be the sum of several outputs, is generally
necessary and adequate to exploit the arithmetic capability of a QDPU
designed to execute up to four transfer operations. There is minor
mechanization cost in providing for selecting the communication of four inputs
(assuming the same total of component inputs to the QDPU) in the case of
rapid-access stored outputs, but the additional costs for component variables
summation, quantization, storage for residue retention, and the programmable
selection for transfer operation are significant. The fact that three input
variables enable high duty factor for a four transfer QDPU, as well as for a
two transfer QDPU, is favorable for the more powerful associated computer.

Another closely related consequence of the type of computations for airborne
and aerospace applications is that both three and four transfer QDPU's can
have high duty factor with a mechanization with three y registers. Of
course, the two transfer QDPU must have two y registers for reciprocal
calculation, and cannot use more than two y registers with two transfer
capability. Component inputs forming an input variable must be quantized
with residue retention registers for good accuracy. These registers are
referred to as 6 registers. For single increment computation they must be
four to seven bits in length for flexible scaling, whereas for multi-increment
computation they must be seven to ten bits in length. Therefore, they are
generally cheaper in storage costs than y and R registers unless stored in
separate drum channels, one to a word time. In this case considerable storage
space is either wasted or available for other purposes. The extra space can be
used for input addresses, as in the drum storage case the 6 registers do not
basically have expensive storage. Multi-input quantization operations in
generating 6 register residues and quantized variables can be mechanized
economically using time shared arithmetic subunits.
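
The idea of quantizing a multi-input sum while retaining the residue can be
sketched as follows. This is an illustrative model only, with assumptions
stated here: component inputs are small signed integers in residue counts, the
output is a single ternary increment (-1, 0 or +1), and one increment equals
2**bits residue counts. The class and method names are hypothetical and do not
describe the actual register mechanization.

```python
from typing import Iterable


class ResidueQuantizer:
    """Sums component inputs, emits a ternary increment, keeps the residue."""

    def __init__(self, bits: int) -> None:
        # One output increment is worth 2**bits residue counts (assumed scale).
        self.full_scale = 1 << bits
        self.residue = 0

    def step(self, component_inputs: Iterable[int]) -> int:
        """Add this step's component inputs to the retained residue and
        return the ternary increment that overflows out of the register."""
        total = self.residue + sum(component_inputs)
        if total >= self.full_scale:
            increment = 1
        elif total <= -self.full_scale:
            increment = -1
        else:
            increment = 0
        # The part not communicated is retained for later steps.
        self.residue = total - increment * self.full_scale
        return increment


if __name__ == "__main__":
    quantizer = ResidueQuantizer(bits=4)
    for batch in ([3, 5], [7], [6, -1], [-9, -9], [-5]):
        print(batch, "->", quantizer.step(batch), "residue", quantizer.residue)
```

Because the residue is carried forward, the emitted increments together with
the final residue account exactly for the sum of all component inputs, which is
the accuracy benefit that residue retention is meant to provide.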

6.9 MECHANIZATION SIMPLIFICATIONS RELATIVE TO CONVENTIONAL
PARALLEL DDA WHICH EFFECT TIME SHARING WITHOUT RATE LOSS
IN MULTI-REGISTER SYSTEMS WITH MANY TRANSFERS PER WORD
TIME - The increased computation capacity of systems with many transfers
per word time has been considered by many to be proportionately paid for
by increased mechanization requirements. This concept is reflected in the
eventual integration of internal computation with input processing, the latter
alone requiring a set complexity level. It should also be independently
concluded that a DDA with many transfers per word time without time shared
input processing may have surprisingly efficient mechanization by
incorporating certain time sharing features which do not imply iteration rate
loss. The element of mechanization cost in proportion to the number of
transfers effected per word time, basically required by application and bit
rate, is only one part of overall cost, and certainly, if it were the only
added cost for several transfer capability (single bit), would make the more
powerful computer only fractionally more costly than a conventional
programmable DDA. The particular mechanism chosen for communication to the
QDPU is simpler than that of a conventional DDA of large program capacity
because the output number is reduced by one half in the QDPU relative to the
conventional DDA.

The costs of providing input selection of accessible outputs and multi-input
quantization per word time mechanization with modal versatility would
appear on first inspection to imply cost proportionate to computation
capability. Closer analysis reveals that these costs are not nearly of such
degree, provided there is full exploitation in design of basic input operation
characteristics. Adequate scaling flexibility is provided by multi-input
registers which are less than one-half to one-fourth the length of the y and
R registers. This implies that a single arithmetic mechanism for the
multi-input quantization operation can, by time sharing, produce two to four
quantized inputs, thereby costing little more in flip-flops than in the
single transfer DDA, provided that a few words of rather cheap core
storage are available. A significant saving in channel count results when
drum storage is used.


Consider the y and Δx allocation for several-input, several-transfer
mechanizations with a high level of transfer versatility obtained by
programmable mode design. Programmable Δx allocation to transfer
units for y variables is effected by program bit storage in flip-flops
during the word time in the four transfer per word time QDDA. The
allocation would appear at first count to require eight flip-flops for
arithmetic operation, ten flip-flops for channel storage, and up to ten
flip-flops for transfer modality. However, using a time shared multi-input
quantization register and only two channels for Δx data storage,
the inputs designated by a seven bit address are selected from a core
memory; the quantization register contents, after computation completion
(1/2 to 1/4 word time), may be split into a residue, which is restored,
and a transfer bit (or bits) used in the arithmetic operation of that
QDPU. It is estimated that application of this design technique for a four
transfer unit reduces flip-flop requirements relative to the proportionate
hardware approach from 28 flip-flops to five flip-flops and 20 cores (the
latter being relatively cheap).

6.10 HEURISTIC COMMENTARY ON COMPUTER DESIGN - As
stated by Claude Shannon* in a heuristic analysis of desirable computer
attributes, the human brain is a suitable computing system
supplied by nature that, in our state of advancement in the computing
field, offers much for enlightenment in computer design. While most
facts are unknown regarding brain operation, he states it is clear that
the brain has extensive parallel operation features. In reflecting on
this advice and statement of fact, the comments of other authorities
in the field of psychology are believed pertinent. While undoubtedly
neuron action has mega-parallelity, the flows of what


*Litton Consultant


may be said to constitute individual thoughts may not be parallel in such
degree. Indeed, psychologists state that the healthy brain can
carry no more than three simultaneous continuous thought
processes, and most healthy brains (of equally intelligent
individuals) can carry only two. It is perhaps a surprising coincidence
that the basic operations which most frequently occur in computations
have a level of complexity and interoperation correlation
that leads to greatest efficiency in mechanization with a degree of
parallelity comparable to that found in human thought processes. The QDDA
mechanization developed exhibits a close parallelism to the human
brain. Each operation element (one word time of computation)
possesses the ability to carry on two simple and parallel operations
simultaneously or, alternatively, to execute a single complex
operation, thus utilizing its computation capabilities maximally at
all times.


CHAPTER  VII 


NEW  CONCEPTS  OF  MULTI-INCREMENT  COMPUTATION  AND 
DEVELOPMENT OF A MULTI-INCREMENT GENERAL (QUOTIENT)
ALGORITHM  COMPUTER  WITH  SECOND  DIFFERENCE 
OUTPUTS  HAVING  SIMPLIFIED  COMMUNICATION 

7.0 INTRODUCTION - The need for multi-increment computation* in certain
portions of airborne and aerospace computation programs has been
established in extensive studies during the program. In Phase I the concept
of using different computation techniques on routines with markedly different
computation requirements was given form in the classification of input
processing and internal computation types. In Phase II it was established that
aerospace computations require division capability, which presents another
type of computation requirement. Internal computations were classified as
those with the more readily met computation requirements, which by the
nature of application programs tend to constitute the major portion of
program operations count. The goal of developing a full scale programmable
computer capable of executing all types of computations, a computer which
has moderate mechanization requirements, was seen to imply the design
approach of modal operation with extensive time sharing of arithmetic and
communication hardware. Time sharing implies some price in program
capacity for the internal computations in particular. Therefore an
increased computation sophistication is implied for internal computations in
a computer using time sharing (for mechanization simplicity) relative to one
without time sharing. Since input processing requires multi-increment
computation, implying the presence in the computer of a many bit transfer
mechanism, the possible best selection of several bit increment computation
for internal computation was considered eminent. The integration of

*Or single increment computation at ultra high iteration rate, which implies
corresponding hardware costs with state of the art hardware.


input processing into the computer system in an efficient manner also implied
that data storage and communication channels be similar or analogous to those
of a conventional DDA. Therefore investigations were launched in the field of
multi-increment DDA design in the direction of several bit increment
computation (as well as the whole word increment computation demonstrated in
the strap-down processor). Precision quotient algorithm was considered
essential for the final computer developed.

7.1 LIMITATIONS OF STATE OF THE ART COMPUTER DESIGN
TECHNIQUES IN MULTI-INCREMENT, VARIABLE INCREMENT
AND QUOTIENT COMPUTATION -

A.  General Design Factors - The historical motivation for single or
few bit rather than many bit increment computer design clearly
stems from the desire for simplified computer mechanization
associated with costs for communication and multi-transfer
(multiplication). Generally the less the increment size the less
the attainable precision, so that the application determines the
tolerable minimum increment bit length. It will be shown later
in this chapter that the general impression regarding inherently
increased communication costs in a multi-increment computer
is incorrect.

B.  Variable Increment Computation Limitations - Variable
increment computation techniques employ single transfer
and call for, at any time during operation, effective communication
of a single binary increment of variable scale; the scale
of the output is $2^K$, K integral, and is communicated as a several
bit word which changes as the required output rate dictates.
A single transfer computer in Stieltjes integration operations
or product computations cannot approach the accuracy
of a multi-transfer computer, whether variable or constant scale
communication is used. In variable increment computation,
when output rates lie in a general range of magnitude, the
communications are of single bit magnitude with a constant scale, and
can have no greater accuracy than a single increment computer
during that period. The overall performance of a variable increment
computation is expected to be somewhat superior to single
increment computation provided the periods of maximum rate
change of outputs are short. The rate handling capability, in a
sense, is the major asset of the variable increment computation.
Multi-increment computation generally provides both rate handling
and precision capability, the latter being much lower in variable
increment. The communication requirements of variable increment
present a significant additional cost relative to single increment
DDA. Those for variable increment are comparable to the
costs for direct > 4-bit multi-increment communication. This
study shows how the superior multi-increment computation may be
mechanized in a form using single increment communication which is
markedly cheaper than that of existing variable increment computers.

C. Multi-Increment Computation with Quotient Algorithm - The
desirable feature of quotient algorithm appeared to present a
stumbling block in the development of multi-increment computation.
It is deduced that the development of variable increment
was an effort to walk around rather than remove the apparent
stumbling block. Apart from lack of design techniques, the major
problems in multi-increment design arise from the cost of communication
and multi-transfer. The development during this
study of design techniques to accomplish multi-increment quotient
algorithm computation, and further to hold communication costs
to those comparable to single increment computation (and less
than variable increment), eliminates two problems. The cost of
multi-transfer mechanization for internal computation in a computer
capable of input processing is shown to be minimal in a
time sharing design.

D. Initial Problems in Developing Design Techniques for Multi-Increment
Computation with Division Algorithm - The technical
design problem of multi-increment computation with division
algorithm in a computer in which only transfer operations are
executed stems from the nature of output generation from the
DDA integrator (or generalized differential processing unit).
Conventional single increment DDA integrator outputs are generated
by natural overflow, i.e., the propagation of a carry from
the most significant bit position of the integrator. Previous
analyses* in multi-increment computation without division algorithm
showed that natural overflow is not appropriate in multi-increment
computation. The output should be generated by one
of the alternative round off techniques. The design approach is
based on the preservation of the relationship of pre- and post-overflow
difference equaling output magnitude, as is normally
attained by natural overflow. The development of more sophisticated
algorithms is expedited by analyzing output criteria and
R register adjustments for outputs. The fundamental difference
in single increment computation between (1) incrementing a variable
and (2) incrementing a variable and dividing by a whole word
variable, in an incremental computer capable only of transfer
operations, may be described in terms of constant unit and


*Analysis  by  J.  Campeau  at  Litton  Industries  in  1957. 




variable whole word scale factors. The output criterion for
(1) might be described abstractly as calling for an output if the
subtraction or addition of unit (scale relative to full register
value) would, in the case of binary or in ternary, reduce the
absolute value of the number in the R register (relative to the
original value for ternary). For (2), the output criterion calls
for an output if the subtraction or addition of the whole word
(divisor) variable (or $2^{\pm K}$ times it if desired) would reduce the
absolute value of the number left in the R register; the output
for (2), if non-zero, is ±1 in the scale chosen for the output, which
may be chosen $2^{K}$ times that of another choice without making
the output determination in a binary computer more difficult
(effected by relative delay). Variable increment computation
can be obtained by mechanizing choice of scale $2^{K}$ for comparison
(according to variable rate, if desired for maximum rate handling).
Mechanization of the output criterion for binary merely requires
using the sign of the R register and the divisor. In ternary, the
mechanization is the same except for a modification to have zero
output if the R register contains less than half the divisor, as
determined by two parallel computations of sum and difference.
In principle, the output criterion for the multi-increment case
can be determined by many parallel test computations for each
possible alternative of output, with the number of test computations
being > $2^{M+1}$ for M bit increment. Evidently the number
of adders required to perform such a direct test approach is
unacceptable. An indirect design approach which leads to relatively
simple mechanization is developed in a later section. The second
problem, which is correcting the R register for each output
(since overflow is not natural), appears in the first quotient




algorithm computer in 1954, where it is deduced that rapid access
registers were used for correction for (a posteriori) deduced
outputs. The quotient algorithms developed here are amenable to
drum register storage for economy, since deduced outputs may
be fed back at the next iteration to correct the R register during
the next operation, simultaneously with the normal operations of
that cycle.

7.2 FEASIBILITY OF SINGLE INCREMENT COMMUNICATION FOR A
MULTI-INCREMENT DDA COMPUTING BAND LIMITED VARIABLES - The
prevailing impression exists in the computer field that a multi-increment DDA
must have multi-increment communication in order to have multi-increment
accuracy. Certainly it is true that the integrator or generalized basic processing
unit must have the information regarding multi-increment changes in
order to have the precision. The question, then, is really whether the information
is available without direct multi-increment communication. A well known
principle of information theory is that information transmission rate may be
low for band limited variables as compared to that for high frequency variables.
Thus, if the internal computations (which are generally lower frequency
than those of input processing) have sufficiently low frequency relative to
iteration rate, then some level of multi-increment accuracy may be possible
using single increment communication of the right kind. Since DDA computations
are generally considered based on a high degree of analyticity (apart
from interruptions of analyticity, which can often be handled using decision
modes or, if necessary, GP supervision), it is expected that most variables
in a DDA program are band limited. The most elementary condition for
simplified communication is that the variables, represented by the communicated
increments, change not more than a certain amount per iteration, i.e.,
for properly scaled variables represented by M bit increment that do




not change by more than a fraction $2^{-M}$ in one iteration. Certainly if M is
sufficiently small the physical properties of the variable permit this assumption,
just as the scaling assumed on a physical basis holds for the total increment.
Consider a sinusoid of frequency f being computed at iteration
interval $\tau$ with multi-increment accuracy. The increment may have the
form

$$\Delta x_n = A \sin(2\pi f \tau n + \phi) \qquad \text{(VII-1)}$$

The change in $\Delta x$ per iteration is

$$\Delta^2 x_n \approx 2\pi f \tau A \cos(2\pi f \tau n + \phi) \qquad \text{(VII-2)}$$

The maximum values are $\Delta x_{max} = A$ and $\Delta^2 x_{max} = 2\pi f \tau A$, hence we have

$$\frac{(\Delta^2 x)_{max}}{(\Delta x)_{max}} = 2\pi f \tau \qquad \text{(VII-3)}$$

Most variables in aerospace computations have frequencies significantly below
0.1 cps; hence a computer at 150 iter/sec with single increment communication and
having many-bit transfer capability could be scaled to actually realize a
multi-increment accuracy consistent with

$$\frac{\Delta^2 x_{max}}{\Delta x_{max}} \le \frac{1}{230} \le 2^{-7} \qquad \text{(VII-4)}$$

namely, 7 bit increment accuracy with single increment communication.




Suppose that 3 bit multi-transfer is mechanized; then the maximum frequency
sinusoid consistent with single increment communication is

$$f_{max} = \frac{1}{2\pi\tau}\,\frac{\Delta^2 x_{max}}{\Delta x_{max}} \qquad \text{(VII-5)}$$

At 150 iter/sec, as for internal computation, $f_{max} \approx 3$ cps. At 600 iter/sec
for 6 bit increment computation, using 2 bit increment communication as for
input processing, $f_{max} = 6$ cps. The maximum frequencies apply to general
variables whose major components have the stated maximum frequency. For
minor components of perhaps higher frequency, the reciprocal of the relative
fraction of full amplitude of the component affects allowable frequency in
direct proportion. The highest frequency variables, such as in air data or
strapdown applications, have major components at < 0.5 cps and perhaps
some minor components at relatively high frequencies. While simplified
communication appears possible for input processing, the relatively small
number of communications required does not require use of the simplified
communication technique. Rather, the internal computations, characterized
by a gross number of communications resulting from the typically large
program, may properly use the simplified communication with significant
saving and also generally be programmed for the full accuracy possible by a
several-bit transfer capability on the generally low frequency variables
assigned.
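
The arithmetic behind (VII-3) through (VII-5) can be checked with a short
sketch. The function names below are illustrative and do not appear in the
report; the sketch assumes that the allowable ratio of second to first
difference for single increment communication with an M bit transfer is
$2^{-M}$, which is the reading suggested by (VII-4).

```python
import math


def second_to_first_difference_ratio(f_cps, iter_per_sec):
    """Peak ratio of second to first difference of a sinusoid: 2*pi*f*tau (VII-3)."""
    tau = 1.0 / iter_per_sec
    return 2.0 * math.pi * f_cps * tau


def increment_bits_supported(f_cps, iter_per_sec):
    """Largest M for which 2*pi*f*tau <= 2**-M (the condition read from VII-4)."""
    ratio = second_to_first_difference_ratio(f_cps, iter_per_sec)
    return math.floor(-math.log2(ratio))


def max_frequency(iter_per_sec, transfer_bits):
    """f_max of (VII-5), assuming the allowed ratio is 2**-transfer_bits."""
    return iter_per_sec * 2.0 ** (-transfer_bits) / (2.0 * math.pi)


if __name__ == "__main__":
    print(increment_bits_supported(0.1, 150))  # 0.1 cps at 150 iter/sec
    print(round(max_frequency(150, 3), 2))     # 3 bit transfer at 150 iter/sec
```

Under these assumptions the first call gives 7 bits and the second gives about
2.98 cps, in line with the 7 bit and 3 cps figures quoted in the text.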


7.3 ALGEBRA OF SCALING OF A MULTI-INCREMENT QDDA WITH SECOND
DIFFERENCE OUTPUTS OF THE QDPU - All DDA mechanizations
involve register storage and communication of information representing
physical quantities (or problem quantities in a simulation computer) that are
subjected to arithmetic operations of variable updating and transfer
(conditional accumulation to R registers). Interpreting any full register of any
fixed length as having a machine arithmetic value of $1^-$, a given register,
having the purpose of storing a given quantity X, has an associated physical
scale $S_X$, which is chosen for overall most accurate computation consistent
with machine operation constraints. In addition to the quotient algorithm
(developed for ternary communication), the computer evolved during the latter
portion of the study (1) generates second difference outputs (rather than first
difference as in all existing DDA's), (2) utilizes in each QDPU additional
registers (first difference storage) for accumulation of ternary communications
(single increment and sign), and (3) utilizes multi-increment updating
and transfer operations (3 bit or 6 bit). The design features (1) and (2),
which are major new design features, are chosen for overflow mechanization
and communication hardware simplicity in achieving (3), the latter having
been widely discussed in the DDA field but not, to our knowledge, materialized
in a computer. An algebra of scaling the multi-transfer QDDA was
derived not only for the basic purpose of programming application computations,
but also for the exploration of the implications of general operation
for optimized design. The choice of definition of machine numbers,
where registers contain maximum machine values of $1^-$, makes the y register
and independent variable register closely analogous to those of a single
increment, single transfer DDA, where a fraction instead of ±1, 0 is added to
a y register with the same scale factor, and a product of a fraction instead of
±1, 0 with an integrand quantity representing the independent variable increment
evolves the R register increment with the same scale factor. The more
general case of a programmable scale factor for transferred quantities was
analyzed to better coordinate the several transfer actions in a QDPU (only
one transfer takes place in a conventional DDA). Scaling of the QDPU outputs
(second differences) for accumulation in first increment registers is a function
of the overflow mechanism which has been developed and the optimal
system performance choice for the physical variable represented, which is
consistent with the single increment communication and transfer bit length
mechanized.




Analysis will use the following general symbology:

$r(x)$ is the machine number in the register associated with physical
variable x, where the full register is considered to contain $\pm 1^-$

$S_x$ is the scale of x in the register, i.e., the physical magnitude of x
for which the register contains $+1^-$

Then in general $x = S_x\, r(x)$.

Analysis is assisted by attributing to contents of the R register the
representation of a physical quantity R, in which case $R = S_R\, r(R)$ where r(R)
is the R register contents. Analysis of multi-increment computation with
2nd difference output, where a division process is executed, may be simplified
by representing the updating of R in terms of the sum of normal transfers
associated with one or more integration processes, $\Delta I_n$, and the
transfer associated with division (by v), in which case

$$R_n = R_{n-1} + \Delta I_n - v_{n-1}\,\Delta\tilde{\theta}_n \qquad \text{(VII-6)}$$

The quantity $\Delta\tilde{\theta}_n$ is the updated first difference of outputs after feedback,
as distinguished from $\Delta\theta_n$, which is the updated first difference including
present output. Thus $\Delta\theta_n = \Delta\tilde{\theta}_n + \Delta^2\theta_n$. The computation to be executed
is

$$\Delta\theta_n = \frac{\Delta I_n}{v_n} \qquad \text{(VII-7)}$$




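In terms of machine register contents and physical scales we have

$$S_R\, r(R_n) = S_R\, r(R_{n-1}) + S_{\Delta I}\, r(\Delta I_n) - S_v S_{\Delta\theta}\, r(v_{n-1})\, r(\Delta\tilde{\theta}_n) \qquad \text{(VII-8)}$$

Transfers associated with integration will be executed to have the same scale
as the R register, hence $S_{\Delta I} = S_R$ and

$$r(R_n) = r(R_{n-1}) + r(\Delta I_n) - K\, r(v_{n-1})\, r(\Delta\tilde{\theta}_n) \qquad \text{(VII-9)}$$

where

$$K = \frac{S_v S_{\Delta\theta}}{S_{\Delta I}}$$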

The quotient algorithm developed does not involve direct "overflow" but
rather an "output" according to criterion, with the effect in computation of
overflow being produced by feedback of "output." Scaling of output, which
depends on the output mechanism, is elucidated as follows: For the feature
of ternary output where zero is the output (when minimum error is achieved
by doing so) the latter would be appropriate (without residue retention) if


AI  -v  AO  , 
n  n- 1  n- 1 


,  ivr°.i 


(VII- 10) 


£ 

for  A  0^  chosen  non-zero.  With  the  desirable  computation  feature  of  residue 
retention,  output  is  zero  if 

|v  I  I  A*0  I 

r  -  Lai ll-i — “1  <0 

n  ‘ 




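Regarding machine outputs as +1, -1, or 0 in the convention of ternary, the
outputs are regarded as contained in a single bit register (not including sign).
Then

$$\Delta^2\theta_n = S_{\Delta^2\theta}\, r(\Delta^2\theta_n) \qquad \text{(VII-11)}$$

where $r(\Delta^2\theta_n)$ = +1, -1, or 0. For $\Delta^2\theta_n$ hypothesized non-zero, for
application in the output criterion inequality, $|\Delta^2\theta_n| = S_{\Delta^2\theta}$;
hence, the test inequality is

$$|R_n| - |v_{n-1}|\, S_{\Delta^2\theta} < 0 \qquad \text{(VII-12)}$$

Expressed entirely in register quantities and scales involved, the test
inequality is

$$|r(R_n)| - K^{*}\, |r(v_{n-1})| < 0 \qquad \text{(VII-13)}$$

where

$$K^{*} = \frac{S_{\Delta^2\theta} S_v}{S_R} = \frac{S_{\Delta^2\theta} S_v}{S_{\Delta I}}$$

The output criterion, in terms of register and physical scale quantities, is

$$r(\Delta^2\theta_n) = \mathrm{sgn}(r(R_n))\, \mathrm{sgn}(r(v_{n-1}))\, u\big(|r(R_n)| - K^{*}\, |r(v_{n-1})|\big) \qquad \text{(VII-14)}$$

provided $S_R, S_v > 0$, and taking u to be the unit step function. Assuming
$r(\Delta^2\theta)$ is added to $r(\Delta\theta)$ according to the relation

$$\Delta r(\Delta\theta) = 2^{-s}\, r(\Delta^2\theta) \qquad \text{(VII-15)}$$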



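multiply both sides by $S_{\Delta\theta}$ to obtain

$$\Delta(\Delta\theta) = 2^{-s}\, \frac{S_{\Delta\theta}}{S_{\Delta^2\theta}}\, \Delta^2\theta \qquad \text{(VII-16)}$$

hence

$$S_{\Delta^2\theta} = 2^{-s}\, S_{\Delta\theta} \qquad \text{(VII-17)}$$

Then the constants K and K* for R register incrementation and output generation
are seen to be related by

$$K^{*} = 2^{-s}\, \frac{S_v S_{\Delta\theta}}{S_{\Delta I}} = 2^{-s} K \qquad \text{(VII-18)}$$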

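Summarizing the R register incrementation and output generation equations,
we have

$$r(R_n) = r(R_{n-1}) + r(\Delta I_n) - K\, r(v_{n-1})\, r(\Delta\tilde{\theta}_n)$$

$$r(\Delta^2\theta_n) = \mathrm{sgn}(r(R_n))\, \mathrm{sgn}(r(v_{n-1}))\, u\big(|r(R_n)| - 2^{-s} K\, |r(v_{n-1})|\big) \qquad \text{(VII-19)}$$

where

$$K = \frac{S_v S_{\Delta\theta}}{S_{\Delta I}}$$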

assuming  >0,  SA®  >0  (the  latter  since  SA  ®  »  S^). 
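
The register equations summarized in (VII-19) can be stepped by hand or with a
small sketch such as the following. It is not the report's mechanization: the
function name, the constant inputs, the zero initial registers, and the choice
that the unit step u equals 1 at zero argument are all assumptions made for
illustration.

```python
def sgn(x):
    """Sign function returning -1, 0, or +1."""
    return (x > 0) - (x < 0)


def run_quotient_unit(dI, v, s, K=1.0, steps=9):
    """Step the register equations of (VII-19) with constant r(dI) and r(v).

    Returns the history of r(R) and of the ternary outputs r(delta^2 theta).
    """
    R = 0.0        # r(R): residue register (assumed to start at zero)
    dtheta = 0.0   # r(delta theta): accumulated first difference of the output
    R_hist, outputs = [], []
    for _ in range(steps):
        # R update uses the first difference held before this step's output
        R = R + dI - K * v * dtheta
        # Ternary output: non-zero only when |r(R)| reaches 2^-s * K * |r(v)|
        threshold = 2.0 ** -s * K * abs(v)
        out = sgn(R) * sgn(v) if abs(R) - threshold >= 0 else 0
        # Single increment accumulation into the first difference (VII-15)
        dtheta += 2.0 ** -s * out
        R_hist.append(R)
        outputs.append(out)
    return R_hist, outputs


if __name__ == "__main__":
    R_hist, outputs = run_quotient_unit(dI=0.03, v=0.5, s=6)
    print(outputs)
```

With r(ΔI) = 0.03, r(v) = 0.5, and s = 6, the first eight outputs are +1 and the
ninth is -1, as the accumulated first difference overshoots the quotient 0.06
and the sign of the residue reverses. By construction the residue equals the
running sum of r(ΔI) minus K r(v) r(Δθ), so for as long as it stays bounded the
accumulated output follows the quotient.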

To elucidate transfer relationships consider the case where $\Delta I_n = \tilde{y} \cdot \Delta x$,
where $\tilde{y}$ is the integration algorithm modified y of same scale and where y is
scaled unity in r(y). Then

$$\Delta I_n = r(\tilde{y})\, r(\Delta x)\, S_{\Delta x} \qquad \text{(VII-20)}$$

and

$$S_{\Delta I} = S_{\Delta x} = S \qquad \text{(VII-21)}$$




Suppose v is also stored with unit scale and $\Delta\theta$ has the same scale as $\Delta x$,
namely $S_{\Delta\theta} = S_{\Delta x} = S$. Then K = 1 and the R register incrementation
equation in this case is

$$r(R_n) = r(R_{n-1}) + \left[\, r(\tilde{y})\, r(\Delta x_n) - r(v_{n-1})\, r(\Delta\tilde{\theta}_n) \,\right] \qquad \text{(VII-22)}$$

with output criterion

$$r(\Delta^2\theta_n) = \mathrm{sgn}(r(R_n))\, \mathrm{sgn}(r(v_{n-1}))\, u\big(|r(R_n)| - 2^{-s}\, |r(v_{n-1})|\big) \qquad \text{(VII-23)}$$

The transfer operations for integration and division are identical except
for sign in this case, for any choice of scale $S_{\Delta^2\theta}$, provided the output
criterion is adjusted so that s satisfies

$$S_{\Delta^2\theta} = 2^{-s} \cdot S \qquad \text{(VII-24)}$$

Actually, for $\Delta^2\theta$ to have a single increment representation satisfying a choice
of $S_{\Delta^2\theta}$, the physical variable $\Delta\theta$ must be known to be rate limited to the
extent that a fractional change in $\Delta\theta$ in one iteration does not exceed $2^{-s}$.


7.3 GENERAL (QUOTIENT) ALGORITHM FOR THE QDD²A BASED ON
THE NUMERICAL COMPUTATION ANALYSIS - The analysis of Chapter II
led to the quotient algorithm for whole word computation in which outputs
$\Delta^2\theta$ are second differences generated using the calculation

$$R_n = R_{n-1} + \tilde{p}_n\, \Delta x_n - \tilde{v}_n\, \Delta\theta_n - v_{n-1}\, \Delta^2\theta_n \qquad \text{(VII-25)}$$

where (~) means a variable is algorithm modified and where the desired
calculation is

$$\Delta\theta_n = \int_{(n-1)\tau}^{n\tau} \frac{p\,dx}{v} \qquad \text{(VII-26)}$$




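Consider along with this computation the more general computation (readily
obtained by generalizing the analysis of Chapter II),

$$R_n = R_{n-1} + \tilde{p}_n\, \Delta x_n + \tilde{q}_n\, \Delta y_n - \tilde{v}_n\, \Delta\theta_n - v_{n-1}\, \Delta^2\theta_n \qquad \text{(VII-27)}$$

where the outputs $\Delta^2\theta_n$ are second differences and

$$\Delta\theta_n = \int_{(n-1)\tau}^{n\tau} \frac{p\,dx + q\,dy}{v} \qquad \text{(VII-28)}$$

When v = 1 this computation reduces to

$$R_n = R_{n-1} + \tilde{p}_n\, \Delta x_n + \tilde{q}_n\, \Delta y_n - (\Delta\theta_n + \Delta^2\theta_n) \qquad \text{(VII-29)}$$

where

$$\Delta\theta_n = \int_{(n-1)\tau}^{n\tau} (p\,dx + q\,dy) \qquad \text{(VII-30)}$$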

In mechanization for multi-increment computation the latter computation requires
two multi-transfer units for parallel operation (one for $\tilde{p}_n \Delta x_n$ and one
for $\tilde{q}_n \Delta y_n$) and one single transfer for $(\Delta\theta_n + \Delta^2\theta_n)$, as will be
discussed later. The same multi-transfer unit requirements hold for the quotient
algorithm first stated.


The application studies of Chapter XI show that the calculations

$$\Delta\theta_n = \int_{(n-1)\tau}^{n\tau} \frac{p\,dx}{v} \qquad \text{(VII-31)}$$

$$\Delta\theta_n = \int_{(n-1)\tau}^{n\tau} (p\,dx + q\,dy) \qquad \text{(VII-32)}$$

include the majority of basic calculations desired in an incremental computer.
The fact that multi-transfer unit requirements in parallel computation are the
same for the two calculation types implies that a basic unit which can be
programmed for either operation can be efficient most of the time during the
computation cycle in executing a typical application program, since the
arithmetic capability of the unit is fully used most of the time. The first
stated algorithm can be put in the form


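$$R_n = R_{n-1} + \tilde{p}_n\, \Delta x_n + \tilde{v}_n \left[ -(\Delta\theta_n + \Delta^2\theta_n) \right] + \Delta v_n\, \Delta^2\theta_n \qquad \text{(VII-33)}$$

in which the total multi-transfer and single transfer requirements are the
same as the sum of integrals algorithm. A unit which is programmable for
both the operations of form

$$R_n = R_{n-1} + \tilde{p}_n\, \Delta x_n + \tilde{q}_n\, \Delta y_n + \left[ -(\Delta\theta_n + \Delta^2\theta_n) \right] \qquad \text{(VII-34)}$$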

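and the above has the basic computation capability discussed. The general
theory calls for integration algorithm of component terms (indicated by $\widetilde{(\;)}$)
which depends purely on the lagged or unlagged nature of the variable, for
which

$$\widetilde{(\;)}_n = (\;)_n - \tfrac{1}{2}\Delta(\;)_n - \tfrac{1}{12}\Delta^2(\;)_n \quad \text{Unlagged Variable} \qquad \text{(VII-35)}$$

$$\widetilde{(\;)}_n = (\;)_n + \tfrac{1}{2}\Delta(\;)_n + \tfrac{5}{12}\Delta^2(\;)_n \quad \text{Lagged Variable} \qquad \text{(VII-36)}$$

On the basis of a limited number of simulations the approximation of second
order algorithm by

$$\widetilde{(\;)}_n = (\;)_n - \tfrac{1}{2}\Delta(\;)_n \quad \text{Unlagged Variable} \qquad \text{(VII-37)}$$

$$\widetilde{(\;)}_n = (\;)_n + \tfrac{1}{2}\Delta(\;)_n + \tfrac{1}{2}\Delta^2(\;)_n \quad \text{Lagged Variable} \qquad \text{(VII-38)}$$

is proposed.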
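
The coefficients 1/2, 1/12 and 1/2, 5/12 are those of the familiar backward
difference integration formulas, and their effect can be checked on a simple
integrand. The sketch below rests on two assumptions that the text does not
state explicitly: the modified value is read as the mean of the variable over
one iteration interval, and a lagged variable is one whose newest sample lies
at the start of that interval. Function names are illustrative.

```python
from fractions import Fraction as F


def _differences(y_nm2, y_nm1, y_n):
    """First and second backward differences of the three newest samples."""
    return y_n - y_nm1, y_n - 2 * y_nm1 + y_nm2


def unlagged(y_nm2, y_nm1, y_n):
    """(VII-35): newest sample assumed at the END of the interval."""
    d1, d2 = _differences(y_nm2, y_nm1, y_n)
    return y_n - F(1, 2) * d1 - F(1, 12) * d2


def lagged(y_nm2, y_nm1, y_n):
    """(VII-36): newest sample assumed at the START of the interval."""
    d1, d2 = _differences(y_nm2, y_nm1, y_n)
    return y_n + F(1, 2) * d1 + F(5, 12) * d2


def unlagged_simplified(y_nm2, y_nm1, y_n):
    """(VII-37): proposed approximation, second difference term dropped."""
    d1, _ = _differences(y_nm2, y_nm1, y_n)
    return y_n - F(1, 2) * d1


def lagged_simplified(y_nm2, y_nm1, y_n):
    """(VII-38): proposed approximation, 5/12 replaced by 1/2."""
    d1, d2 = _differences(y_nm2, y_nm1, y_n)
    return y_n + F(1, 2) * d1 + F(1, 2) * d2


if __name__ == "__main__":
    # y(t) = t^2 with unit interval; the exact mean over [2, 3] is 19/3.
    print(unlagged(1, 4, 9), lagged(0, 1, 4))
    print(unlagged_simplified(1, 4, 9), lagged_simplified(0, 1, 4))
```

For y = t² both second order forms reproduce the exact mean 19/3 over the
interval from 2 to 3, while both simplified forms give 13/2, an error of 1/6
for this particular integrand.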

The integration algorithm programming of the two forms of computation considered
is therefore identical*. The programming of assignment of independent

*The term $\Delta\tilde{v}_n\, \Delta^2\theta_n$, of second order, is replaced by $\Delta v_n\, \Delta^2\theta_n$.




variables  for  multi -transfer  may  be  considered  identical  in  the  two  calculation 


types  if 
bles 


£-^A9  +  is  i 


included  in  the  set  of  programmable  independent  varia- 

which  case 


for  multi -transfer.  Note  that  .either  Ay  d  +  A*®  S"Lin 

£-^A9n  +  A3  0nj  *-  * - ' - J  --  -* 


should  be  single  transferred,or 

+  A*® 


[•(4,n  *  4*Sa)] 


(VII -39) 


in which case $\Delta v \cdot \Delta^2\theta$ should be single transferred. It may be deduced
that a generalized unit capable of both calculation types is programmable
within the framework of that required for the sum of integral calculation type,
adding somewhat to diode requirements but not flip flop requirements. In
the two output QDD²A, the valuable added facility for three multi-transfers
to $R_1$ and one multi-transfer to $R_2$ (a modified allocation of multi-transfer
units) does not affect the mechanization structure for transfers involving
fed-back output.


7.3 QDD²A DESIGN FOR INPUT PROCESSING - INTERNAL COMPUTATION
TASKS OF FULL AEROSPACE MISSION - The computation task for a full
aerospace mission involves capability of executing input processing as well
as internal computation. A QDD²A designed solely for several-bit increment
computation has a degree of input processing capability determined primarily
by chosen multi-increment bit length and parallel computation features.
Analysis of Chapter X indicates that internal computation requires no more than
several-bit increment computation at even the modest iteration rates imposed
by large program and considerable time shared operation for input processing.
In a practical sense, then, it is wasteful to mechanize more than three-bit
computation for internal computation. We seek to apply the concepts of
Chapter IV to the developments following it. The mechanization requirements
for input processing are very different from those for internal computation
at all but very high iteration rates for input processing, assuming intermediate
rate for internal computation. A time shared design for input processing at
high iteration rate, chosen for mechanization economy, meets accuracy
requirements (with several-bit increment computation) provided the input
processing program is very small, so that the high iteration rate can be achieved
without imposing intolerably low iteration rate for internal computation. In
view of the considerable number of routines of character intermediate between
input processing and internal computation (such as air data computations),
which are not practical at high iteration rate, the ability to handle a
number of demanding computations at intermediate iteration rate has real
value. Two new approaches have been investigated in achieving the total set
of appropriate computation sophistications in a single computer so that
iteration rate (and accuracy) is maximum for a given level of mechanization
complexity. The first approach is introduced in Chapter IV and further discussed
in Chapter VIII, namely, the double/single precision computation capability
in which two 3-bit increment multipliers capable of parallel computations at
twice the rate of serial computation may, by programming, also act at designated
word times as a six-bit multiplier for "double" precision (as required at
intermediate iteration rates, for example, in air data calculations). With this
design feature, demanding calculations (usually comprising only a small part
of the total program but too numerous to be included in high rate input
processing) may be executed with required precision without slowing down the
rate of internal computation in the same computation loop, which has program
extent for full aerospace mission in certain cases of hundreds of integrators.
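
The idea of two 3-bit increment multipliers acting together as a six-bit
multiplier can be pictured with ordinary integer arithmetic. The text does not
describe how the two units are coupled; the sketch below assumes the
conventional arrangement in which the six-bit increment magnitude is split
into high and low 3-bit halves and the high product is shifted three places.
Function names are illustrative, and sign handling is left out.

```python
def multiply_3bit(y, inc):
    """One 3-bit increment multiplier: whole word y times a 3-bit magnitude."""
    if not 0 <= inc < 8:
        raise ValueError("increment magnitude must fit in 3 bits")
    return y * inc


def multiply_6bit_coupled(y, inc):
    """Two 3-bit multipliers coupled to handle a 6-bit increment magnitude."""
    if not 0 <= inc < 64:
        raise ValueError("increment magnitude must fit in 6 bits")
    hi, lo = inc >> 3, inc & 0b111
    # High half carries a weight of 2^3 relative to the low half.
    return (multiply_3bit(y, hi) << 3) + multiply_3bit(y, lo)


if __name__ == "__main__":
    print(multiply_6bit_coupled(1234, 45) == 1234 * 45)
```

In single precision operation the same two units would instead each take a
separate 3-bit increment in parallel, which is the mode the text describes for
internal computation.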

The second new arithmetic unit design approach, derived in Chapter XII,
should not be regarded as essential to efficient QDD²A design. The approach
utilizes, however, the basic organization features of the QDD²A with second
difference computation. The design presented in Chapter XII requires that
inputs be single increment second differences. In applications where this is
consistent with scaling, the derived multiplier, the D² multiplier (having
approximately the complexity of a 3-bit increment multiplier), can act as a
many-bit increment multiplier. For example, an analytic function such as
sin ωt can be computed in 10-bit increment steps (for $\omega = 1/(2\tau)$) and one part
in one million accuracy using the D² multiplier. For aerospace applications,
however, certain problems are anticipated in exploiting the principle on which
the D² multiplier design is based. Here the major application of system value
would be input processing; however, the digital analog form of inputs for
pulse stream converters and the noise and rate character make the predication
of single increment second differences of sufficiently fine resolution
uncertain. The nature of high rate incremental pulse stream input information
representing first differences itself leads to two-bit second differences by any
direct conversion method for constant rates. The approach of forcing the
introduction to the digital computer of single-bit second differences (by a
specially designed servo type preprocessing unit) presents the introduction of
non-linear lag effects which can introduce significant errors degrading
effective computation algorithm.

Two  fundamental  approaches  to  exploiting  the  principle  of  the  multiplier 
in  real  time  computers  with  sensed  inputs  are  presented.  One  calls  for  the 
design  of  a  hybrid  multiplier  which  is  partly  conventional  for  lower  significant 
increment  bits  and  similar  to  D2  operation  for  the  remaining  increment  bits. 
Provided  input  signal  and  noise  are  scaled  correctly,  then  a  two-bit  or  three - 
bit  second  difference  input  could  be  computed  without  preprocessing.  The 
D2  multiplier  does  not  generalise  directly  for  two-bit  second  differences  in 
a  mechanisation  of  practical  value.  The  second  approach  to  exploiting  the 
principle of the $D^2$ multiplier is the generation of second differences
by second-differencing of whole-word sampled values. A sampler
unit of sufficient word length will generate single-increment second
differences of given resolution, provided the input to the sampler is consistent
with their scale. For a significant increase in resolution over that of a
conventional 3-bit multiplier of the same cost as the $D^2$ multiplier, the
sampler word length must typically be $\geq$ 16 bits ($> .04 \times 10^{-4}$) since, for
example, at 100 iter/sec (intermediate rate) the QDD²A with conventional
multiplier can compute with $10^{-4}$ resolution on typical high rate inputs, where
second differencing of inputs requires 2 additional bits to eliminate round-off
generation of non-single-bit second difference inputs. With the double
precision feature, which couples two conventional multipliers, a resolution of
$10^{-5}$ is assured at intermediate iteration rate and $10^{-6}$ in the high rate loop.
Existing sensors and transducers for real time computation with voltage or current
sampling typically have resolution of $< 10^{-4}$, and shaft encoder conversion
has resolutions of $10^{-4}$ to $10^{-6}$. Therefore the $D^2$ multiplier offers possible
hardware saving in arithmetic unit complexity on the basis of input
characteristics primarily when coupled with encoder converters. More specifically,
a significant gain would probably be effected only with ultra high precision
angle measuring devices such as the Microsyn for star tracking systems. As
a result of the trend to single telescope systems, the measurements of stellar
intercepts are discontinuous from one star to the other. These intercepts
represent first differences in navigation error correction computations;
therefore, the second differences are discontinuous. This presents another
problem in utilizing the $D^2$ multiplier (but not the conventional multiplier) in
the apparently most attractive area of application. The cost of forming
second differences is comparable to the difference in cost between a 3-bit and
5-bit conventional multiplier. The considerations of reliability of operation
in systems subject to high frequency electrical transients serve to limit the
$D^2$ multiplier application. All of the factors imply effective loss of resolution
of the sensor in $D^2$ multiplier computation by a factor of 2 to 3.
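
The dependence of second differences on input scale can be illustrated with a small sketch. It is not taken from the report: the sinusoidal test input, the rounding sampler, and the function names are assumptions chosen for illustration. It second-differences whole-word samples and reports the largest second difference, in units of one increment, for a slow input on a short word and a fast input on a long word.

```python
import math


def second_differences(samples):
    """Second differences of a sequence of whole-word integer samples."""
    first = [b - a for a, b in zip(samples, samples[1:])]
    return [b - a for a, b in zip(first, first[1:])]


def sample_sinusoid(freq_hz, rate_hz, word_bits, count):
    """Hypothetical sampler: quantize a unit sinusoid to a signed word."""
    full_scale = 2 ** (word_bits - 1) - 1
    return [
        round(full_scale * math.sin(2 * math.pi * freq_hz * k / rate_hz))
        for k in range(count)
    ]


def max_second_difference(freq_hz, rate_hz, word_bits, count=1000):
    """Largest second-difference magnitude, in increments of the word."""
    samples = sample_sinusoid(freq_hz, rate_hz, word_bits, count)
    return max(abs(d) for d in second_differences(samples))


# Slow input, short word: second differences stay within one increment.
slow = max_second_difference(freq_hz=0.125, rate_hz=100.0, word_bits=8)
# Fast input, long word: second differences span many increments.
fast = max_second_difference(freq_hz=5.0, rate_hz=100.0, word_bits=16)
print(slow, fast)
```

In this sketch the slow case stays single-increment only because the input scale is matched to the word length; the fast case shows how an input that is not consistent with the scale produces multi-increment second differences.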

Where input processing is a fundamental computation task in the application,
the firmest basis of system design in the light of this is that of the single /
double precision arithmetic unit or a conventional multiplier / $D^2$ multiplier
combination in double precision for input processing calculations.




CHAPTER  VIII 


TYPES  OF  PROGRAMMABLE  MODAL  ACTION  OF  THE  FULL  SCALE 
INCREMENTAL  COMPUTER  IMPLIED  BY  COMPUTATION  TASK 
AND  MECHANIZATION  FACTORS 

8.0 PURPOSE OF PROGRAMMABLE MODAL ACTION - The major factor
in the selection of a particular programmable modal action is the quantitative
computation capability attained for the level of mechanization complexity
implied in doing so. The underlying prerequisites for the system task, including
input processing and internal computation (analyzed throughout major portions
of this report), are assumed here and analyzed in relation to the underlying
hardware characteristics. The total set of these functions is made possible
by modal action of the arithmetic module and the internal communication
processes of the basic differential processing unit, the QDPU.

8.1 ARITHMETIC MODULE MODES

A. General Factors Implying Arithmetic Module Modes - Consider
the general relations of a computer with arithmetic modal features
to conventional computers. If we consider lack of continuous full
use of arithmetic capabilities of given hardware in a computer a
design deficiency, then a clear deficiency of contemporary
incremental and general purpose computer designs exists. The
deficiency stems from their processing rate and precision inflexibility
for subroutines presenting different types of computation demands.

For example, while the G.P. is designed to carry out a computation
at fixed rate with an accuracy of $10^{-8}$, a computation requiring
only $10^{-4}$ accuracy can be carried out at no higher rate. A
computer with the same arithmetic complexity could be designed
to achieve twice the processing rate for such routines. But since
the other portions of the computation task require the higher accuracy,
the conventional rationale is to accept the loss of efficiency.

Actually, since the typical application requires few high accuracy
or difficult computations and a majority of low accuracy or easy
computations, the loss through functional inflexibility is generally
large. The historical lack of attack on this design defect
has two partial explanations with regard to design philosophy:

1. The widespread impression of the fundamentality of the
arithmetic unit; i.e., that tampering with it in design efforts except
as a unit cannot be done.

2. The possibility of modal or switching action costing more to
implement than the savings gained.

Actually, in answer to (1), the fact that double precision can be
programmed in a G.P. (though with inordinate processing rate
loss) for $10^{-16}$, for example, instead of $10^{-8}$ accuracy, shows
that a $2h$ bit accuracy multiplier is no more fundamental than
two individual $h$ bit accuracy multipliers. With regard to (2), it
is shown in later analysis that the double/single precision
programmability in the QDD²A costs little more than a flip-flop while
doubling effective processing rate in lower accuracy computations.
Using conventional parallel design principles to equal this rate, the
hardware cost would be nearly an order of magnitude higher than by
the arithmetic modal design.
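
The remark that a double-length multiplier is no more fundamental than two half-length multipliers can be sketched in a few lines. This is an illustrative model only, not the report's mechanization: the half-word length `h`, the unsigned operands, and the function names are assumptions. A double-length multiplier operand is split into two `h`-bit halves, each half is handled by one `h`-bit multiplier, and the partial products are combined by a shift.

```python
def multiply_h_bits(y, x_part, h):
    """Stand-in for one h-bit multiplier: accepts only an h-bit operand."""
    if not 0 <= x_part < (1 << h):
        raise ValueError("operand does not fit in h bits")
    return y * x_part


def coupled_multiply(y, x, h):
    """Form y * x for a 2h-bit unsigned x using two h-bit multipliers."""
    if not 0 <= x < (1 << (2 * h)):
        raise ValueError("operand does not fit in 2h bits")
    x_low = x & ((1 << h) - 1)   # least significant h bits
    x_high = x >> h              # most significant h bits
    # The high partial product is scaled by 2**h, i.e. delayed h bit positions.
    return (multiply_h_bits(y, x_high, h) << h) + multiply_h_bits(y, x_low, h)


print(coupled_multiply(12345, 0xBEEF, 8) == 12345 * 0xBEEF)
```

When only single precision is needed, the same two `h`-bit units are free to work on two independent products, which is the source of the doubled processing rate described above.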

In the QDD²A the advantages of modal arithmetic action are exploited
in a natural and efficient manner because the external communication
structure between QDPU and external inputs is complete and
adequate for double precision as well as single precision
computation. The modal operation of pairing several-bit transfer units
to act as a many-bit transfer device has been evaluated for both
conventional multiplier unit designs and the new $D^2$ multiplier
developed in the chapter on logical design. In reference to the
drum memory case, the transfer for the added significant bits of
the independent variable, selectively stored in the double precision
mode in a flip-flop already present, is properly scaled by several
bit time delays. These delays are obtained by selective feed in
of bits to flip-flops for the second single precision transfer. A
coupling of one conventional 3 bit transfer and $D^2$ multiplier (of
the same complexity, which often acts as a 4 or 5 bit multiplier)
is described in the following paragraphs.

B. Transfer Operation in Double Precision Mode by Coupling of
Conventional Multiplier and the Developed $D^2$ Multiplier - The
conventional multi-transfer mechanization for $\Delta X$ which is 4
bits (including sign) and the developed $D^2$ multiplier have
essentially the same flip-flop requirements. When the second difference
$\Delta^2 X$ is known to have a sufficiently small maximum the $D^2$
multiplier can act as a many bit multiplier; e.g., a sinusoid of
frequency $f$, constant angular rate, can be computed with the $D^2$
multiplier at iteration rate (IR), with $\Delta X$ of $\log_2(\mathrm{IR}/2\pi f)$ bits not
including sign; hence at 100 iter/sec, a .125 cps sinusoid can be
computed with an 8 bit $\Delta X$ (including sign) as compared to the
conventional 4 bit $\Delta X$ multiplier of about the same complexity. Input
processing presents insurmountable granularity problems to a
conventional DDA, and except at very high iteration rates also to
a several bit increment DDA. Thus, in input processing, the value
of many bit increment computation is evident. A programmable
double precision mode has the value of meeting this need for
demanding computation routines. This makes possible double
rate computation for the majority of computations which are not
so demanding but largely make up the bulky program for the full
aerospace mission. Input processing, especially at low rate, is
characterized by a comparatively large $\Delta^2 X_{\max}/\Delta X_{\max}$ ratio. The
large ratio implies that the $D^2$ multiplier can be depended upon
to attain, in the worst case, only several bit increment computation.
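
The word-length figure quoted for the sinusoid case can be checked numerically. The sketch below assumes the bit count is the expression $\log_2(\mathrm{IR}/2\pi f)$ rounded up to a whole number of bits, with one further bit for sign; the rounding rule and the function name are assumptions, not statements from the report.

```python
import math


def delta_x_bits(iter_rate_hz, freq_hz):
    """Bits needed for the first difference of a sinusoid.

    Returns (magnitude_bits, bits_including_sign), assuming the magnitude
    requirement log2(IR / (2*pi*f)) is rounded up to a whole bit.
    """
    magnitude_bits = math.ceil(math.log2(iter_rate_hz / (2 * math.pi * freq_hz)))
    return magnitude_bits, magnitude_bits + 1


# 100 iter/sec and a .125 cps sinusoid, the case quoted in the text.
print(delta_x_bits(100.0, 0.125))
```

For these values the ratio IR/2πf is about 127.3, so the magnitude needs 7 bits and the increment 8 bits including sign, in agreement with the 8 bit figure in the text.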

Double precision mechanization studies show that two $D^2$
multipliers do not readily couple to increase resolution by more than
one bit. On the other hand, a conventional 3 bit increment and
$D^2$ multiplier can be coupled to allow for 4 bit increment in $\Delta^2 X$,
increasing the multi-increment precision of a double precision
mode by a factor of 16. Consider the representation of a many
bit $\Delta X$ in terms of truncated and residue components (4 bit
residue case):

$$\Delta X_n = \Delta X_{T_n} + \Delta X_{R_n}$$

For a $\Delta^2 X$ which is known expressible as a 4 bit residue, it
is seen:

$$\Delta^2 X_n = \Delta^2 X_{R_n} \qquad \text{(VIII-1)}$$

Updating of $\Delta X_n$ is:

$$\Delta X_{n+1} = \Delta X_n + \Delta^2 X_n = \Delta X_{T_n} + \left[\Delta X_{R_n} + \Delta^2 X_{R_n}\right] \qquad \text{(VIII-2)}$$

A single carry $C$ results in adding residues, hence:

$$\Delta X_{R_n} + \Delta^2 X_{R_n} = \Delta X_{R_{n+1}} + C_{n+1} \qquad \text{(VIII-3)}$$

Thus, the equality:

$$\Delta X_{n+1} = \left[\Delta X_{T_n} + C_{n+1}\right] + \Delta X_{R_{n+1}} \qquad \text{(VIII-4)}$$

Where the integrand is $y$ and the algorithm corrected quantity
involved in transfer is $y^*$, then:

$$y^* \Delta X_{n+1} = y^* \left[\Delta X_{T_n} + C_{n+1}\right] + y^* \Delta X_{R_{n+1}} \qquad \text{(VIII-5)}$$

is to be transferred for double precision. A conventional 3 bit
multiplier can transfer $y^* \Delta X_{R_{n+1}}$. The $D^2$ multiplier can transfer
$y^* \left[\Delta X_{T_n} + C_{n+1}\right]$, employing $C_{n+1}$ in the same manner as
$\Delta^2 y$ in single precision mode. The conventional and $D^2$ multipliers
operate as a coupled double precision multiplier by programming
integrands to be identical and delaying start of the $D^2$ multiplier
by 4 bit times. The latter is effected using the $\delta$ update
mechanization in which certain $\delta$ quantities, being updated before
others (with savings in channel and arithmetic requirements), are
stored in the communication core memory (total core count 10 to 12
words). The time of drawing out the core stored $\delta$ quantities differs
in double precision from that of single precision by 4 bit times.
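
The truncated/residue bookkeeping of this section can be sketched as follows. The sketch is an illustration under stated assumptions rather than the report's mechanization: the residue is taken as an unsigned 4 bit field, the carry is allowed to be -1, 0 or +1, the truncated part is counted in units of 16, and all function names are hypothetical. It updates the two components from a second difference and forms the coupled product from two partial transfers.

```python
RESIDUE_BITS = 4                      # the "4 bit residue case"
RESIDUE_MASK = (1 << RESIDUE_BITS) - 1


def split(delta_x):
    """Split delta_x so that delta_x == 16 * truncated + residue."""
    return delta_x >> RESIDUE_BITS, delta_x & RESIDUE_MASK


def update(truncated, residue, second_difference):
    """Add a second difference to the residue and propagate a single carry."""
    total = residue + second_difference
    carry = total >> RESIDUE_BITS     # -1, 0 or +1 for a 4 bit second difference
    new_residue = total & RESIDUE_MASK
    return truncated + carry, new_residue, carry


def coupled_transfer(y, truncated, residue):
    """Sum of the two partial transfers, the high one scaled by 4 bit places."""
    high = (y * truncated) << RESIDUE_BITS   # truncated-plus-carry part
    low = y * residue                        # residue part
    return high + low


delta_x = 37
truncated, residue = split(delta_x)
for second_difference in [5, -7, 15, -15, 3, 12]:
    delta_x += second_difference
    truncated, residue, carry = update(truncated, residue, second_difference)
    print(delta_x, truncated, residue, carry)
```

The 4 bit left shift in `coupled_transfer` plays the role of the 4 bit time delay applied to one multiplier in the serial machine.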


8.2 INTERNAL COMMUNICATION MODES OF THE BASIC PROCESSING
UNIT (QDPU) - The evaluation of the QDPU internal communication modal
design approach is more complicated than the obviously powerful arithmetic
modal design approach analyzed in the preceding paragraphs. Evaluation
should be relative to conventional parallel DDA designs since only a parallel
processing unit can match the QDPU in computation capability, the level of
which is established as necessary for pertinent applications. Since the QDPU
carries out several operations together using internal communication (within
the QDPU rather than the QDDA as a whole) while a parallel DDA integrator does
not (communicating not within the DDA integrator but rather within the DDA as
a whole), the degree of overall external communication is expected to be less.
In application programming analyses, the number of external outputs (counted
once) communicated is at most one-half that of a parallel DDA with comparable
program capacity (but much lower computation capacity than the QDDA); also,
though less useful, it is found that the number of input sinks selecting outputs
is at most three-quarters that of the DDA. Rapid access (core) memory for
communication is reduced by one-half on this basis, and by another factor of
one-half or more using the new second difference communication developed in
the chapter on multi-increment computation, with communication selection
requirements for full communication reduced by more than one-third. Thus,
substantial hardware savings are made possible by internal communication
processes of the QDPU and serve to compensate the modest cost of these modal
features.

A second important hardware saving made possible by internal communication
modality is the saving in the number of registers. For pertinent comparison,
the parallel DDA without division algorithm requires, with drum memory,
three extra channels (for register information) and associated read-write
update logic, or with core memory, over one hundred and fifty words more of
core memory.

Because of the large gains in computation capability made possible by reduced
R-register count and quotient algorithm in the QDDA, a conventional parallel
DDA with actually matched computation capability would require a higher
iteration rate, obtained by further parallelism and further cost, than indicated
above.

8.3 DECISION OPERATIONS IN THE MECHANIZATION OF MULTI-INCREMENT
QDDA WITH SECOND DIFFERENCE COMMUNICATION

A. Introduction - It is essential that the QDDA have the decision
capabilities of the conventional DDA in order that supervision by G.P.
be nil or held to a minimum in decision modified calculations. As
a result of the special features of second difference communication
the decision mode design problem is quite different than in a
conventional DDA, though, of course, the fundamental approaches in
generating decision modified functions must be used. In the
conventional DDA, conditional cutoff of an integration with respect
to time is readily produced by replacing time by decision function
(programmed as integrator input for independent variable). In
the QDDA the decision function cannot be regarded as a normal
input since the latter is generally a second difference, not
directly used in multi-transfers. Secondly, if the multi-transfers
to R were conditionally cut off as is possible in a
conventional integrator, the output cutoff would not cut off
integration in the QDDA since the QDDA outputs are second differences.
Despite the problems indicated by these initial observations a
decision capability was designed into the QDDA which on balance
actually increased the "integrator" equivalent of the QDPU relative to
conventional DDA over the estimates previously based on
continuous operations. The modest hardware requirement to accomplish
the decision features is comparable to that in the conventional DDA.


B. Decision Command Generation Parallel with Input Processing -
The generation of a decision command function of the simplest
type requires only one DDA integrator. In order to retain a QDPU
equivalent of four plus DDA integrators, the decision command
functions are best generated in a QDPU which is also performing
a substantial task in generating some other computation for
output of the parallel channel. Since decision command signals are
generated by sensing the sign of a y register, clearly no
multi-transfer action is used. This suggested the QDPU in an input
processing mode which, in generating the sum of two multi-increments
of double precision, utilizes the entire instantaneous arithmetic
capability of the computer. Clearly, if the decision command
function of the simplest type is generated in parallel with input
processing calculations the integrator equivalent has increased by one
"integrator." Since input processing generally involves at most two
internal computer generated inputs (and two external inputs) the
programming of a test function as an additional input fits the
normal programming structure. The test function $x$ becomes $d^2x$
followed by normal $\delta$ register and y register updating where in
the simplest decision type y was initialized at $-x_0$, which then
leads to decision function output $D$ of $D = 1$ when $x > x_0$,
$D = 0$ when $x < x_0$. Since input processing is executed at
high rate, and internal computation at moderate rate, test
variable $d^2x$ actually enters the high rate loop at the low rate. Consider
for example, the case where the rate ratio satisfactory for input
processing is 4 (though in double precision the use of intermediate
rate is more typical). In this case, the effective scaling of $d^2x$ to
$x$ is increased by 4 since $dx$ updates $x$ at 4 times the rate of the
internal computation loop. The decision command variable
generated at high rate is used only one fast iteration out of four in the
internal computation loop. The multi-iteration rate feature causes
no difficulty in generating decision command signals for internal
computation or input processing. The operation is essentially
free in the simplest decision command type because the processing
rate is unchanged when full arithmetic capability is devoted to
input processing.
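
The simplest decision command type can be sketched as a register initialized at $-x_0$ whose sign is sensed after each update. The sketch is illustrative only: the test variable is assumed to start from zero, the output at exact equality is assumed to be 0 (the text does not specify that case), and the function names are hypothetical. The second function models the use of the high rate command only one fast iteration out of every four.

```python
def decision_commands(x0, x_increments):
    """Decision output D per fast iteration: 1 when x > x0, else 0."""
    y = -x0                       # y register initialized at -x0
    commands = []
    for dx in x_increments:
        y += dx                   # y now holds x - x0
        commands.append(1 if y > 0 else 0)   # sign of the y register
    return commands


def subsample(commands, rate_ratio):
    """Keep one fast iteration out of every `rate_ratio` for the slow loop."""
    return commands[rate_ratio - 1::rate_ratio]


fast = decision_commands(5, [2, 2, 2, -3, 4, 0, 0, 0])
print(fast)
print(subsample(fast, 4))
```

Because the command is obtained from a sign alone, no multi-transfer is involved, which is why it can run alongside input processing.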

Input processing and decision command operations are executed
in parallel when a decision command operation is required in the
program code of the QDPU. Since the decision generation does not
need feedback to the $\delta_m$ register as required by the algorithm for
multi-increment computation with second difference communication,
which is used in input processing (and internal computation),
the $\delta_m$ register is available for use in input processing. The input
program code has two bits which determine the $\delta$ register in
which is placed an input of given address in rapid access
communication memory. Since, normally, only three $\delta$ registers
have programmed inputs the normal code can select an input to
$\delta_m$ with elaboration of the code. In the case of input processing,
this fourth input register has definite utility because input
processing can call for, on occasion, the combination of disparate
variable pairs in (usually) pure integration processes. In the case
of internal processing, the extensive programming studies show
that only three input $\delta$ registers are required for essentially full
versatility. Since the $\delta_m$ register is required for algorithm
purposes, a saving of one channel results together with the input
arithmetic requirements. The latter comment presumes that no
additional arithmetic requirement is necessary. The $\delta_m$ register,
during input processing, is applicable only if a single input is
allowed. In navigation equations and thrust cutoff, this limitation
presents no difficulty. Actually, if programming limitations were
encountered, using the unused output mechanism of the decision channel
to accomplish multi-input capability to $\delta_m$ without cost in
flip-flops would be relatively simple.

C. Decision Response Modes - Upon absorption of decision command
signals in the QDPU, decision action of conditional transfer takes
place using simple logical "and" operations. Absorption of
decision command signals into the QDPU has a simple mechanization
which obtains highly efficient QDPU operation. The total number
of analytic variables (as distinguished from decision variables)
which are concomitantly absorbable is reduced by only one, from
eight to seven, while, what is most important, the analytic
variables may be collected in all three registers. When the QDPU has
decision programming bits (two in number) indicating it is to have
action in one of several alternative decision modes, the address of
inputs to the $\delta_m$ register determines whether one of two or more
variables is to be treated as a decision variable, the remainder
to be treated in the normal manner for analytic variables in
updating the $\delta_m$ register. The decision command bit $D$ is then used
as the conditioning variable for any of the several alternative
conditional transfers of $\delta_m$ contents to $y_m$, $y_T$ registers. All
conditional operations obtainable in conventional DDA may be obtained
by conditional transfers:

1. $\delta_m$ to $y_m$ unless $D = 0$, which enables function limiting.

2. $\delta_m$ to $y_T$ unless $D = 0$, and unconditional transfer to $y_m$,
which enables cutoff of updating of a variable with respect
to one of its components.

3. $\delta_m$ to $y_m$ or $-\delta_m$ to $y_m$ according to the value of $D$.

4. $\pm y_m$ transfer to R according as $D = 1$ or 0, enabling
sign control.

The fact that any one of the simply mechanized decision response
modes conditions the otherwise ordinary fully programmable
action of the QDPU implies an integrator equivalent of the unit
that is greater than the average during such modes. Examples of
programming the QDPU in decision modes for doppler damping
and thrust-cut applications are analyzed in the chapter on
application programming and evaluation of the QDD²A.
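
Two of the conditional transfers can be sketched as a function of the decision bit. This is a hypothetical model, not the report's logic design: a "transfer" is modeled as adding the increment register contents to the y register, the mode names are invented for illustration, and the sign polarity in the sign mode (positive for D = 1) is an assumption.

```python
def respond(mode, d, delta_m, y_m):
    """Return the updated y register for one iteration.

    mode "inhibit": the transfer takes place unless D = 0.
    mode "sign":    the transfer is added for D = 1 and subtracted for D = 0
                    (polarity assumed for illustration).
    """
    if mode == "inhibit":
        return y_m + delta_m if d == 1 else y_m
    if mode == "sign":
        return y_m + delta_m if d == 1 else y_m - delta_m
    raise ValueError("unknown mode: " + mode)


# Function limiting: y follows its increments until a limit of 6 is reached.
y = 0
for increment in [3, 3, 3, 3]:
    d = 1 if y < 6 else 0         # decision command from a separate channel
    y = respond("inhibit", d, increment, y)
print(y)
```

In both modes the decision bit only gates or negates a transfer that would occur anyway, which is consistent with the simple logical "and" conditioning described above.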

8.4 MULTI-ITERATION RATE FOR INPUT PROCESSING AND INTERNAL
COMPUTATIONS - The development during this contract study of an
arithmetic module, capable of alternating the work load between input processing
(double precision) and internal computation (single precision), enables a
marked savings in hardware by time sharing. The basic computation
requirements of input processing and internal computation may imply not only
different multi-transfer requirements, but in certain critical input processing
applications may imply different iteration rates. The QDDA can be
programmed to obtain single and double precision as desired. To obtain efficient
different iteration rates for the two types of calculation for a drum memory
computer, the appropriate allocation of read and write heads and some
switching logic is required. Input processing calculations which require a
relatively high iteration rate derive this requirement not only from the high
frequency character of the computer inputs, but also from a high
computation error sensitivity where high accuracy is required. In general,
computation error sensitivity stems from computation routine metastability or long
term instability; i.e., the accumulative rather than damped error response.
Because metastability and instability stem from a feedback character of the
calculations, it is generally true that a sensitive input processing calculation
is essentially isolated in generation from internal computations, which in
essence simply utilize the results of the former. Evidently, a sensitive input
processing, requiring a relatively high iteration rate M times that of the
internal computation need for acceptable accuracy, probably only
communicates with the latter at the lower iteration rate with instantaneous rather
than accumulated outputs. Thus, in cases where the QDD²A has a
multi-iteration rate input processing-internal processing operation, the
communication setup between one and the other processings probably could be chosen,
with a modest hardware saving, to use the same rapid access memory setup
as at equal rates (though with program scale changes of inputs from input
processing QDPU in internal computation). However, performance is
assured by supplementing the communication rapid access memory by a half
dozen 4 bit core registers for accumulating outputs of input processing in a
high rate loop. A QDD²A with drum memory can be set up for the
multi-iteration rates by alternating print instructions to write heads, spaced M
ratio of delays from read heads, while alternating QDDA modal action at
half the lesser delay interval of that of input processing and internal
computation lines.
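
The accumulation of high rate outputs for hand-over to the slower loop can be sketched as below. The sketch is an illustration under assumptions: a single accumulating register is modeled, it is cleared after each hand-over, and the function name and the rate ratio argument `m` are hypothetical.

```python
def accumulate_for_slow_loop(fast_outputs, m):
    """Accumulate fast-loop output increments and hand over once every m iterations."""
    accumulator = 0               # stands in for one small core register
    handed_over = []
    for count, increment in enumerate(fast_outputs, start=1):
        accumulator += increment  # runs at the high iteration rate
        if count % m == 0:        # the slow loop reads once per m fast iterations
            handed_over.append(accumulator)
            accumulator = 0
    return handed_over


print(accumulate_for_slow_loop([1, 0, 1, 1, -1, 0, 0, 1], 4))
```

With a rate ratio of 4, no increment produced by the fast loop is lost even though the slow loop samples only every fourth iteration.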

8.5 MODAL TYPES SUMMARY - The following modal types discussed in
detail in previous paragraphs are developed for the full scale computer:

A. Iteration rate mode is either high or intermediate accordingly
as the QDPU number is $\leq 14$ or $> 14$. Input processing is
presumed to involve $\leq 14$ QDPU (consistent with application study).

B. Arithmetic mode is single or double precision for any QDPU
according to the arithmetic mode programming bit. Generally,
input processing uses double precision at high iteration rate.
Certain error sensitive calculations such as sinusoid calculation
on external input angles may be computed with required precision
at intermediate rate using double precision.

C. Internal Communication Modes of the QDPU

1. Transfer allocation modes are those selected in QDDA
computation programming studies and delineated in the
programming code analysis.

2. Decision command mode is automatic in one channel of
QDPU used also for input processing. Accordingly, as a
programmed variable exceeds a given constant (as
determined by the sign of a given $\delta$-register), the output is the
decision command signal 1 or 0 used in decision response
modes. Up to 14 independent decision command signals
can be generated without cost in overall processing rate.

3. Decision response modes, according to the two decision
response programming bits, interpret two or more inputs
to a given $\delta$ register ($\delta_m$ register) as a decision signal or
normal variables input depending upon whether or not the
source was the second channel of the first 14 QDPU. The
updating is in normal fashion with normal variables and
the interpretation of the decision command signal (a
normally programmed input) is according to programming bits,
and the decision response programming bits, of the QDPU
for which the actions are one of the following:

(a) Transfer Conditional Inhibition of $\delta_m$ to $y_m$, thus
enabling function limiting.

(b) Update Conditional Cutoff of $\delta_m$ to $y_T$ and unconditional
update of $y_m$ by $\delta_m$.

(c) Transfer Conditional Sign Change of $\delta_m$ to $y_m$.


CHAPTER  IX 

DDA AND QDD²A SIMULATIONS ON THE IBM 704
COMPUTER AND PRIMARY RESULTS

9.0 OBJECTIVES AND PRIMARY RESULTS ON THE DDA AND QDD²A
SIMULATION EFFORTS - Approximately one third (11 hours) of the total
allocated programming hours on the IBM 704 for the Phase II effort was
utilized in simulations of ordinary and elaborated conventional ternary DDA
and of the QDD²A. The broad objective of this effort was to assist in the
development and evaluation of incremental computer designs for internal
computation. Internal computations account for the majority of the integrators
utilized in typical aerospace application programs, but generally do not
involve, in fullest measure, the special computation problems of input
processing explored during Phase I. The full aerospace mission has
associated programs requiring a DDA with a capacity of several hundred
precision integrators, which implies that, while the individual internal
computation routine might not in all cases be challenging, nevertheless the
overall computation task does challenge the most sophisticated existing
DDA computers.* The reduction of mechanization complexity in the
computer system designed to handle input processing and internal computation
was ultimately obtained through a single time shared basic processing unit
(generalized integrator), which basically implies a need for further increased
internal computation capacity (as the result of the time sharing feature).

* The goal of reducing GP computer mechanization requirements in a GP-DDA
system to a fraction of that required in previous aerospace systems requires
for internal computation a DDA of >250 integrator capacity.

The choice of specific simulation efforts in internal computer design is made most
efficient by concentrating effort on those types of internal computations that
occur in aerospace applications and that present the basic source of limitations of
computation capacity of existing DDA computers. Aerospace applications
were shown to imply the capability of executing computations involving
division. The lack of sufficient quotient capability is one of the two basic
limitations of the conventional DDA. Division operations are generally recognized
to present severe accuracy limitations. The major previous real time
application of the DDA, airborne (pure) inertial navigation, has generally avoided
division altogether by making programming use of the narrow altitude and
velocity range of the conventional aircraft. The lack of precision division
capability is becoming apparent in the latest airborne doppler damped inertial
navigation systems where long term navigation accuracy is sought. Precision
division capability is a major requirement in full aerospace
applications. The second basic limitation of the conventional DDA in internal
computation is common to the fundamental limitation of input processing, as
it is directly related to rate limit and resolution limitations that generally
lower overall computation capacity and actually accentuate the division
operation limitations as well as those of the other typical computations that are
applications of integration in the DDA. In recognition of these two basic
limitations of existing DDA computers, the programming effort allocated to DDA
and QDD²A simulation studies was directed primarily toward investigations of
division and multi-increment DDA design techniques and their evaluation.
Simulations of sinusoids were also executed to complement efforts of Phase I. The
most important simulation result of the overall effort was verification of the
QDD²A multi-increment quotient algorithm developed during the second phase of
the program. A second result may potentially have value in slightly elaborated
conventional DDA, but could not be thoroughly evaluated by simulation within the
scope of this effort. This result was the discovery of a technique for improved
digital Stieltjes integration. Realization of this technique could improve
quotient capability of a near conventional DDA. As an aerospace full mission
computation program executed largely by QDD²A implies a very large step in
increased computation capacity in relation to existing DDA computers, the
major analytical development of design techniques for a multi-increment
quotient algorithm DDA was chosen as the primary simulation evaluation
program objective during the limited period prior to the required
programming effort for evaluation of the strap-down processor constructed during
Phase II. This development occurred mid-course during Phase II.
Successful simulation of the computer that has been referred to as the QDD²A has
confirmed that a major breakthrough in DDA design technique was accomplished.

9.1 PROGRAM FOR SIMULATION OF DDA COMPUTATIONS AND THE
MODIFIED PROGRAM FOR QDDA COMPUTATIONS

A. Programming Approach - Simulation of computations involving DDA
integrator ensembles with elaborated DDA integrator designs involves
programs that may be closely related but which, unless provided for in
the programming approach, are not readily altered from one simulation
to the next. To minimize the programming effort in preparation for
successive simulations, a program for general ensembles of DDA
integrators with all the alternative algorithm features was developed.
Thus, a single program modification card could supply the format for
the next simulation. When the QDDA was later developed analytically,
the problem of programming for simulation evaluation was obviated by
altering the DDA program to treat a set of DDA integrators storagewise
as a single QDDA and incorporating the several basic changes of
operation.

B. Program Structure for Simulation of an Arbitrary Computation
by a System of DDA Integrators (with Mixed Higher Order Algorithms,
Derivary Communication, Roundoff Reduction and Special Initialization
Features) - A program structure was developed for general DDA system
simulations. The processing of a single DDA integrator and execution
of its communications through a single iteration of an arbitrary
computation can be completely defined in the following general manner:


1. State a set of numbers which define:

   Updated each iteration:

   a. The state of the integrator before processing.

   b. The incoming communications to the integrator for that iteration.

   Fixed throughout a simulation:

   c. The special features of operation of that particular integrator,
      namely, integration algorithm and roundoff reduction features.

   d. The identity of the other integrators to which the said integrator
      communicates, the type of utilization by the integrator
      communicated to (i.e., integrand and/or independent variable), and
      the sense of utilization of these communications (i.e., direct or
      with sign reversal).

2. (Fixed subroutine for all simulations.) Subject the set of numbers
   described in 1 to a subroutine (called the integrator processing
   subroutine) which computes (using 1a, 1b, and 1c): a) the new
   integrator state, then updates 1a; and b) the new integrator outputs,
   then (using 1d) updates 1b of the appropriate other integrators. The
   subroutine will at its end replace the input communication sums for
   "immediate use" (by then, already used) with the input communication
   sums for "subsequent use" (to be used at the next iteration, when
   further communications have been summed into them).

A program capable of carrying out an arbitrary computation involving
integrators is obtained as follows:

3. Storage blocks, N in number, are allocated in the rapid access
   memory. Each storage block contains the information of 1a, 1b, 1c,
   and 1d associated with a single integrator. The initial values of
   the numbers in each storage block are inserted in the rapid access
   memory prior to simulation start by means of cards or tape.

   After program start, the rapid access storage blocks are updated
   during execution of the program; i.e., the 1a and 1b numbers are
   updated in the manner indicated below.

4. The operations of 2 are incorporated in the integrator processing
   subroutine.

5. Lists of the specific data types and operations of 1 and 2 are:

a. Integrator data storage block information:

(1) y-register content (whole word)

    R-register content (whole word)

(2) Input communications, summed in one space for immediate iteration
    use and in another for next subsequent iteration use, for each
    quantity:

    Independent variable communication total:

      1st order for immediate use:    $\Sigma\Delta x$
      1st order for subsequent use:   $\Sigma\Delta x$
      2nd order for immediate use:    $\Sigma D_x$
      2nd order for subsequent use:   $\Sigma D_x$

    Integrand variable communication total:

      1st order for immediate use:    $\Sigma\Delta y$
      1st order for subsequent use:   $\Sigma\Delta y$
      2nd order for immediate use:    $\Sigma D_y$
      2nd order for subsequent use:   $\Sigma D_y$

    At the end of integrator processing the "subsequent use" sums
    replace the "immediate use" sums.

(3) $S_1$: Integration algorithm scale factor for 1st order term
    (whole word)

    $S_2$: Integration algorithm scale factor for 2nd order term
    (whole word)

    $T_{OI}$: Test number used in decision for overflow inhibition and
    R-register reset to 1/2 (whole word)

(4) $F_r^{1st}$, $F_r^{2nd}$ ($r = 1, 2, 3, 4$): Factor by which the
    associated communication is multiplied before being sent to another
    integrator data block.

    $C_r^{1st}$, $C_r^{2nd}$ ($r = 1, 2, 3, 4$): Storage places (in other
    integrator data blocks) for the outputs (1st and 2nd order terms) of
    the integrator presently processed, chosen so that each output is
    sent to the correct: (1) integrator, (2) type of utilization to be
    made (integrand or independent variable), (3) time of utilization
    (as soon as the integrator is processed or in the subsequent
    iteration).




b. The integrator processing subroutine carries out the following
operations:

(1) Y-register updating:

    $$ y_n = y_{n-1} + \Sigma\Delta y $$

(2) Integration algorithm updating of R-register with overflow
    inhibitor test feature:

    Reset the R-register to $R_n = 1/2$ if $|y_n| \le T_{OI}$.

    Otherwise,

    $$ \Delta R = \left[\, y_n + S_1\,\Sigma\Delta y + S_2\,\Sigma D_y \,\right] \Sigma\Delta x \tag{IX-1} $$

    is added to $R_{n-1}$ to obtain $R_n^*$.

    If $R_n^* \ge 1$, then take $R_n = R_n^* - 1$
    If $R_n^* \le -1$, then take $R_n = R_n^* + 1$
    If $-1 < R_n^* < 1$, then take $R_n = R_n^*$

(3) Overflow to be communicated as 1st order terms:

    If $|y_n| \le T_{OI}$, then no overflow: $O_n = 0$

    Otherwise, if

    $R_n^* \ge 1$, then take $O_n = 1$
    $R_n^* \le -1$, then take $O_n = -1$                      (IX-2)

(4) Communicated 2nd order terms:

    D_  *  8  i  4k  °  L  Ay                                 (IX-3)
    n  n

(5) Scaling using F-factors:

    $$ F_r^{1st}\,O_n \rightarrow \Delta x,\ \Delta y; \qquad F_r^{2nd}\,D_n \rightarrow D_x,\ D_y \tag{IX-4} $$

(6) Replace contents of the input communication sums for "immediate use"
    (already used) with contents of the input communication sums for
    "subsequent use."
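
The storage block of 5a and the processing steps of 5b can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the program used in the study: the names `IntegratorBlock`, `process`, and the `routes` tuple layout are hypothetical; the overflow-inhibit test is taken to act when $|y_n| \le T_{OI}$; and the communicated 2nd order (derivary) output of step (4) is not modeled, so only 1st order overflows are routed.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class IntegratorBlock:
    """One storage block (items 1a-1d) for a single integrator; names are illustrative."""
    y: float = 0.0        # y-register (whole word)
    r: float = 0.5        # R-register (whole word)
    s1: float = 0.5       # scale factor for the 1st order term
    s2: float = 0.0       # scale factor for the 2nd order term
    t_oi: float = -1.0    # overflow-inhibit test number; a negative value disables the test
    # Each sum is stored as [immediate use, subsequent use].
    sum_dx: List[float] = field(default_factory=lambda: [0.0, 0.0])
    sum_dy: List[float] = field(default_factory=lambda: [0.0, 0.0])
    sum_Dy: List[float] = field(default_factory=lambda: [0.0, 0.0])
    # Routes: (target block index, "x" or "y", F factor, slot 0=immediate / 1=subsequent)
    routes: List[Tuple[int, str, float, int]] = field(default_factory=list)


def process(blocks: List[IntegratorBlock], i: int) -> int:
    """Process integrator i for one iteration and return its overflow (-1, 0 or +1)."""
    b = blocks[i]
    dx, dy, Dy = b.sum_dx[0], b.sum_dy[0], b.sum_Dy[0]
    b.y += dy                                   # step (1): y-register update
    overflow = 0
    if abs(b.y) <= b.t_oi:                      # step (2): overflow inhibit (assumed sense)
        b.r = 0.5
    else:
        r_star = b.r + (b.y + b.s1 * dy + b.s2 * Dy) * dx
        if r_star >= 1.0:                       # step (3): positive overflow
            overflow, b.r = 1, r_star - 1.0
        elif r_star <= -1.0:                    # step (3): negative overflow
            overflow, b.r = -1, r_star + 1.0
        else:
            b.r = r_star
    for target, kind, factor, slot in b.routes:  # step (5): scale and send 1st order output
        sums = blocks[target].sum_dx if kind == "x" else blocks[target].sum_dy
        sums[slot] += factor * overflow
    for sums in (b.sum_dx, b.sum_dy, b.sum_Dy):  # step (6): subsequent sums become immediate
        sums[0], sums[1] = sums[1], 0.0
    return overflow


# Usage: integrate y = 1/2 against a full-rate independent variable.
demo = [IntegratorBlock(y=0.5, s1=0.0)]
demo_total = 0
for _ in range(8):
    demo[0].sum_dx[0] = 1.0
    demo_total += process(demo, 0)
```

With y held at one half and one increment of the independent variable per iteration, the R-register overflows on alternate iterations, which is the pulse-rate behavior a DDA integrator is expected to show.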


Simulations of Reciprocal Calculation by Conventional and
Digital Stieltjes DDA and the QDDA

1. Program Features - It is generally recognized that a conventional
DDA executes reciprocal calculations (and more complex operations
involving division) with poor accuracy in comparison to most other
computations. The alternative programs employ integrators alone
(Amble's method, with no servo) in the one case, and implicit
calculation with integrators and a servo in the other case. The
alternative methods are recognized to result in different detailed
error properties but generally the same error magnitudes. The goal of
this study effort was the development of a precision computer. The
conclusion that servos involve inherent errors from lag effects and
servomechanism properties led to placing primary study emphasis on
integrator systems without the use of servos. It will be seen that the
primary error effects in reciprocal calculation, as contrasted to
other calculations, result from an imperfect digital Stieltjes
integration algorithm. The reciprocal calculation by a DDA provides a
test calculation for the fundamental process of integration with
respect to a variable other than full rate, where single transfer
effects the operation.


To evaluate the conventional DDA with ordinary and elaborated
algorithms, short run tests were devised so as to be extraordinarily
severe. This was accomplished by generating high frequency inputs
with large amplitude excursions, requiring that integrator register
lengths be very short. The test input function was, in all cases, of
the form

$$ I_n = A + B \sin \omega n \tag{IX-5} $$

for which the DDA was allocated the task of computing $1/I_n$.
For $B > 0$ the input function represents generally a partial
(less than full) rate. For general A, B the effects of oscillation
could be evaluated.
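
The sense in which such an input is a partial rate can be illustrated by forming its single-increment pulse stream. The sketch below is an assumption-laden example rather than part of the original program: the function name `input_increments` is hypothetical, the parameter values are those that appear in the legend of Figure 9-2, and the quantum of $2^{-5}$ is an assumed register resolution.

```python
import math


def input_increments(a, b, omega, quantum, n_iter):
    """Return the stream of increments (in quanta) of I_n = a + b*sin(omega*n)."""
    level = math.floor(a / quantum + 0.5)          # quantized I_0
    stream = []
    for n in range(1, n_iter + 1):
        new_level = math.floor((a + b * math.sin(omega * n)) / quantum + 0.5)
        stream.append(new_level - level)           # -1, 0 or +1 when b*omega < quantum
        level = new_level
    return stream


# Assumed parameters: A = 2, B = 7/8, omega = 2**-5 rad/iteration, quantum = 2**-5.
stream = input_increments(a=2.0, b=7 / 8, omega=2 ** -5, quantum=2 ** -5, n_iter=2000)
fraction_active = sum(1 for d in stream if d != 0) / len(stream)
```

Because the largest change of the input per iteration, $B\omega$, is smaller than the assumed quantum, every iteration carries at most one increment, and a non-zero increment is present in only a fraction of the iterations.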

The classical type* algorithm for reciprocal calculation may be
derived by either of two methods: (1) exact numerical difference
relations; (2) numerical integration algorithm. While all algebraic
computations such as the reciprocal calculation may be assigned an
exact numerical difference relation, differential equation solution
computations are directly analyzed in terms of numerical integration
algorithms. Algebraic equations when differentiated produce
differential equations whose solution algorithms are likewise
analyzed directly in terms of numerical integration algorithms. The
integration algorithm is therefore general in applicability to
algebraic and differential equation solution, whereas the direct
difference relation approach is not. Any computation may be executed
using, for each integrator of the system, an appropriate one of only
two integration algorithms (both occurring in the program). A
computer which

*  Based  on  numerical  rather  than  digital  processes. 




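performs exact difference equation solutions and also the integration
algorithms would formally require mechanization for four programmable
algorithms for second order accuracy. Consider the exact difference
relation for the reciprocal calculation $\theta_n = 1/I_n$, as obtained
in the two steps of differencing $\theta_n I_n$,

$$ \Delta(\theta_n I_n) = \theta_{n-1}\,\Delta I_n + I_n\,\Delta\theta_n = 0 \tag{IX-6} $$

and substituting $I_n = 1/\theta_n$ and solving for $\Delta\theta_n$ to obtain

$$ \Delta\theta_n = -\theta_n\,\theta_{n-1}\,\Delta I_n \tag{IX-7} $$

an exact difference relation. Next consider the differential of $\theta I$,

$$ d(\theta I) = \theta\,dI + I\,d\theta \tag{IX-8} $$

in which substitute $I = 1/\theta$ and solve for $d\theta$ to obtain

$$ d\theta = -\theta^2\,dI \tag{IX-9} $$

An exact difference is formally derived by integrating over an
interval from $(n-1)\tau$ to $n\tau$:

$$ \Delta\theta_n = -\int_{(n-1)\tau}^{n\tau} \theta\,dx(t) \tag{IX-10} $$

where

$$ x(t) = \int_{(n-1)\tau}^{t} \theta\,dI $$

Since $\theta$ is available only at $t = (n-1)\tau$ and before, the lag
correction integration algorithm in virtual variables is appropriate;
hence, to second order,

$$ \Delta\theta_n = -\left[\theta_{n-1} + \tfrac{1}{2}\,\Delta\theta_{n-1} + \tfrac{5}{12}\,\Delta^2\theta_{n-1}\right]\Delta x_n \tag{IX-11} $$

where

$$ \Delta x_n = \int_{(n-1)\tau}^{n\tau} \theta\,dI. $$

The same consideration implies the computation of $\Delta x_n$ by

$$ \Delta x_n = \left[\theta_{n-1} + \tfrac{1}{2}\,\Delta\theta_{n-1} + \tfrac{5}{12}\,\Delta^2\theta_{n-1}\right]\Delta I_n \tag{IX-12} $$

That the computation in terms of the integration algorithm agrees
with the computation with the exact difference relation to first order
is deduced by substituting $\Delta x_n$ in the $\Delta\theta_n$ relation:

$$
\begin{aligned}
\Delta\theta_n &= -\left[\theta_{n-1} + \tfrac{1}{2}\,\Delta\theta_{n-1} + \tfrac{5}{12}\,\Delta^2\theta_{n-1}\right]\Delta x_n \\
&= -\left[\theta_{n-1} + \tfrac{1}{2}\,\Delta\theta_{n-1} + \tfrac{5}{12}\,\Delta^2\theta_{n-1}\right]^2 \Delta I_n \\
&\approx -\theta_{n-1}\left(\theta_{n-1} + \Delta\theta_{n-1}\right)\Delta I_n \\
&\approx -\theta_{n-1}\,\theta_n\,\Delta I_n = \Delta\theta_n \ \text{(exact difference)}
\end{aligned}
\tag{IX-13}
$$

If account is made for second order differences of virtual and
desired variables the agreement is good to second order.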
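
The agreement claimed above can be checked numerically in whole-word (floating point) arithmetic. The following sketch is offered only as an illustration: it assumes the input of the Figure 9-2 legend, $I_n = 2 + \tfrac{7}{8}\sin(2^{-5}n)$, uses the true $\theta_n = 1/I_n$ for the past values, and takes the lag correction coefficients 1/2 and 5/12 of (IX-11) and (IX-12). All variable names are hypothetical.

```python
import math

A, B, OMEGA = 2.0, 7 / 8, 2 ** -5      # assumed test input parameters


def I(n):
    """Test input I_n = A + B*sin(OMEGA*n)."""
    return A + B * math.sin(OMEGA * n)


theta = [1.0 / I(n) for n in range(2001)]   # true reciprocal, whole word

exact_residual = 0.0    # worst violation of the exact difference relation (IX-7)
algorithm_error = 0.0   # worst gap between (IX-11)/(IX-12) and the true increment
largest_step = 0.0      # largest true increment of theta, for scale

for n in range(3, 2001):
    d_i = I(n) - I(n - 1)
    d_theta = theta[n] - theta[n - 1]
    exact_residual = max(exact_residual,
                         abs(d_theta + theta[n] * theta[n - 1] * d_i))
    d1 = theta[n - 1] - theta[n - 2]                      # first backward difference
    d2 = theta[n - 1] - 2 * theta[n - 2] + theta[n - 3]   # second backward difference
    bracket = theta[n - 1] + d1 / 2 + 5 * d2 / 12
    d_x = bracket * d_i                                   # (IX-12)
    d_theta_alg = -bracket * d_x                          # (IX-11)
    algorithm_error = max(algorithm_error, abs(d_theta_alg - d_theta))
    largest_step = max(largest_step, abs(d_theta))
```

The exact difference relation holds to rounding error, while the integration algorithm departs from the true increment by an amount that is small compared with the increment itself.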

The simulated DDA configuration for reciprocal calculation
is indicated in Figure 9-1.

Figure 9-1. Schematic of Simulated DDA Reciprocal Calculation


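The programmable integration algorithm (apart from the elaborations
developed) was of the form

$$ \Delta x_n = \left[\theta + S_1^{(1)}\,\Delta\theta + S_2^{(1)}\,\Delta^2\theta\right]\Delta I_n \quad \text{(Integrator No. 1)} \tag{IX-14} $$

$$ \Delta\theta_n = -\left[\theta + S_1^{(2)}\,\Delta\theta + S_2^{(2)}\,\Delta^2\theta\right]\Delta x_n \quad \text{(Integrator No. 2)} \tag{IX-15} $$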


2. Initial Single Increment DDA Reciprocal Calculation Runs -
Initial runs simulated a conventional single increment DDA with
alternative integration algorithms of the classical form. The
reciprocal calculation to be executed was


e  ,  - -  (IX-16) 

"  0.50  ♦  -r^  sin(2"*nj 


which in a single increment DDA may be carried out with
register lengths of five bits, at most, using Amble's method.
For a DDA iterating at 200 iterations/sec, the test calculation
involves frequency components at about 1 cps. Initialization of the
R and y registers was selected with alternative subsignificant biases
to test the effect of small perturbations. The algorithms tested were
the same for integrators no. 1 and 2 and included the cases

Run (1)  $S_1 = 1/2$, $S_2 = 5/12$   Second Order Algorithm

Run (2)  $S_1 = 1/2$, $S_2 = 0$      First Order Algorithm

Run (3)  $S_1 = 0$,   $S_2 = 0$      Lagged (Zero Order) Algorithm

Run (4)  $S_1 = 1$,   $S_2 = 1$      Led (Zero Order) Algorithm

In these runs the communicated $\Delta y$ was used as the algorithm
$\Delta y$ directly and the derivary as $\Delta^2 y$. All runs (which were
checked for programming accuracy) demonstrated errors in
excess of the granularity in 500 iterations and errors of the
order of the variation of $\theta_n$ in 2000 iterations. An approximate
graph of results and the desired computed function are presented
in Figure 9-2. The general effect consistently observed is an
incorrect attenuation of the computed function amplitude. The first
three runs were executed first.

The first two runs have lag correction numerical integration
algorithms of second and first order accuracy and, in a
whole word (rather than single increment) incremental
computer, should yield accurate results. The fact that
large errors resulted is a consequence of the single transfer,
single increment mechanization simulated. Sinusoid computations
by single increment DDA yield good results with the
second order and first order algorithms (lag adjusted for
serial or parallel computation) and poor results for the lagged or
led zero order algorithm. The DDA reciprocal calculation
results were somewhat poorer for the lagged algorithm (Run 3).
Analysis of whole word computation of reciprocal


Figure 9-2. Simulated Conventional DDA Calculation of Reciprocal using
Amble's Method: Integration Algorithm Classically Mechanized
[Plot against iteration number N, 200 to 2000. Curves: True; Classically
Mechanized Algorithms with $S_1 = .5$, $S_2 = 5/12$; $S_1 = 0$, $S_2 = 0$;
$S_1 = 1$, $S_2 = 1$. Computed function:
$\theta = 1/I = 1/(2 + \tfrac{7}{8}\sin 2^{-5} n)$. Initial values shown:
$Y_0 = .50292997$; $R_0 = .5$; $R_0 = .987306875$.]


calculations, as well as simulation of single increment computation,
indicates that lag will produce oscillation amplitude
attenuation for the reciprocal calculation with inputs having
an oscillatory component. The analysis of the whole word
computation case indicated that a perturbation of the observed
error magnitude could, for the selected type of input
to reciprocal calculation, be generated by a one-half iteration
lead. Run 4 was therefore selected to determine the
degree of improvement in the single increment DDA using
an incorrect numerical integration algorithm which should
compensate for another error source. Results showed slight
improvement, but not of the degree called for on the numerical
computation basis. Runs were made with the same set of integration
algorithms but with different initial R register settings
of R = 0.5 and 0.75 and slightly perturbed (subsignificant)
initial y register values. Results were little changed in
the different runs. It was concluded that the error effects in
digital Stieltjes integration with single increment cannot be
compensated solely by a limited modification of the classical
integration algorithm (mechanized in the conventional manner for DDA)
from that called for by theory, but rather require a more subtle
basis in mechanization that should be explored in greater detail.
That the error effects are attributed to the mechanics of
Stieltjes integration (independent variable not time) was a
deduction based on the fact that other simulations of
single increment DDA which involved integrations with time as
independent variable led to good results in all cases, consistent
with the theory of numerical integration. Detailed study of the
micro-aspects of the runs was focused on the reason for the
failure of transmission of integration algorithm terms.
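
The statement that the lag correction algorithms should be accurate in a whole-word incremental computer can be illustrated with a short floating point simulation of the two-integrator loop of Amble's method. This is a hedged sketch, not a reproduction of the runs reported here: register quantization, single-increment transfer and R-register mechanics are all omitted, the function name `whole_word_reciprocal` is hypothetical, and the input is assumed to be that of the Figure 9-2 legend.

```python
import math

A, B, OMEGA = 2.0, 7 / 8, 2 ** -5     # assumed input I_n = A + B*sin(OMEGA*n)


def whole_word_reciprocal(s1, s2, n_iter=2000):
    """Return the worst |theta_n - 1/I_n| of a whole-word Amble's-method loop."""
    i_prev = A
    th = [1.0 / A, 1.0 / A, 1.0 / A]   # theta at n-1, n-2, n-3 (started at rest)
    worst = 0.0
    for n in range(1, n_iter + 1):
        i_now = A + B * math.sin(OMEGA * n)
        d_i = i_now - i_prev
        d1 = th[0] - th[1]                     # lagged first difference
        d2 = th[0] - 2 * th[1] + th[2]         # lagged second difference
        bracket = th[0] + s1 * d1 + s2 * d2
        d_x = bracket * d_i                    # integrator no. 1
        d_theta = -bracket * d_x               # integrator no. 2
        th = [th[0] + d_theta, th[0], th[1]]
        worst = max(worst, abs(th[0] - 1.0 / i_now))
        i_prev = i_now
    return worst


cases = {
    "second order": (0.5, 5 / 12),
    "first order": (0.5, 0.0),
    "lagged zero order": (0.0, 0.0),
    "led zero order": (1.0, 1.0),
}
errors = {name: whole_word_reciprocal(s1, s2) for name, (s1, s2) in cases.items()}
```

Under these assumptions the second order lag correction algorithm tracks the reciprocal far more closely than the lagged zero order algorithm, which is the whole-word behavior against which the single-increment results are being judged.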

3. Reciprocal Computation by DDA Integrator Setups Other
Than Amble's Method - The poor performance in reciprocal
calculation by conventional DDA with conventional integration
algorithms effected by typical mechanization approaches stimulated
the selection of a set of alternative integrator setups and
algorithms (the latter with Amble's method) for evaluation in
a parallel effort. Because promising results were obtained by
the concomitantly developed digital Stieltjes algorithm in
Amble's setup, the two alternative integrator ensembles then
being programmed were not thoroughly evaluated beyond a
single run. One of the approaches gave relatively good results
compared to Amble's method with the same conventional
algorithm mechanics, but results were not comparable to Amble's
method with the new algorithm (later developed). It is possible
that if the new algorithm were adopted in this alternative
computation for reciprocal calculations, results might be better than
with Amble's method in very much longer term operation than
that carried out in evaluating the new algorithm. The devised
alternative reciprocal computation methods and run parameters
are indicated in the diagrams that follow. The second method,
which may be called the servoed Amble's method, was not
programmed correctly, nor was a good choice of the parameter $\lambda$
estimated on any quantitative basis. The concept of the correction
term $\lambda(\theta\,dI + I\,d\theta)$, which should be zero when
$\theta = 1/I$, is that of servoing accomplished by integrators rather
than operational integrators.


Alternative  reciprocal  calculations  consisting  of  the  Square  and  Integrate 
Method  and  Servoed  Amble's  Method  are  shown  in  Figures  9-3  and  9-4. 


Figure 9-3. Alternative Reciprocal Calculations

(1) Square and Integrate Method

    Integration algorithms (classically mechanized):
        $S_1 = 0.5$,  $S_2 = 1/12$
        $S_1 = -0.5$, $S_2 = -1/12$

    Run results: Mean absolute error in $\theta$ of 0.08 after 1800
    iterations (about 1/6 the error rate in Amble's method).

Figure 9-4. Servoed Amble's Method

(2) Servoed Amble's Method

    Differential equation:
        $d\theta = -\theta^2\,dI - \lambda(\theta\,dI + I\,d\theta)$
        (the last term being the linear servo term)

    Integration algorithms (classically mechanized):
        $S_1 = +1/2$, $S_2 = +5/12$
        $S_1 = +1/2$, $S_2 = +5/12$
        $S_1 = -1/2$, $S_2 = -1/12$

    Initial register values:
        $y_1 = 0.52292997$
        $y_2 = 0.50292997$
        $y_3 = 0.5$
        $R_1 = 0.5$
        $R_2 = 0.5$
        $R_3 = 0.5$


4. Reciprocal Calculation by Single Increment DDA with Elaborated
Algorithm and Derivation of Digital Stieltjes Integration
Algorithms -

a. Initial Efforts in Algorithm Development - In the initial runs
of single increment DDA reciprocal calculation, an observed
lack of response to first (and higher) order integration
algorithm change, in the quantitative degree called for by the
theory of numerical integration, was interpreted to mean a
lack of transmission of first (and higher) order algorithm terms
to the output of the DDA integrator. Both DDA integrators
in the reciprocal calculation by Amble's method are
assigned functions of what may be called digital Stieltjes
integration, the independent variables not being full rate.
The lack of algorithm transmission, and the observation
regarding calculation structure characteristic of conditional
transfer in Stieltjes integration, served to assist the choice
of direction of the detailed examination of the initial simulation
runs. Error phenomena were highly magnified by the
choice of high frequency inputs. It was observed that diophantine
phasing of the $\Delta y$ and $\Delta x$ variables in an integration
$y\,\Delta x$ led to highly sporadic action in the transmission of
first order terms. While transmission may be sporadic,
causing noise-like errors, the average transmitted value
would be expected, with proper digital algorithm, to correspond
to the correct whole word values. The idea of reducing the
degree of sporadic action by using a smoothed estimate
$\overline{\Delta y}$ of $\Delta y$ was considered. A slowly changing
smoothed value will be transferred any time a $\Delta x$ is non-zero.
A smoothing calculation which is readily mechanized is


The smoothed $\overline{\Delta y}$ is used in the integration algorithm
and is not used in the y register. The correct numerical integration
algorithm (with $S_1 = 1/2$ and $S_2 = 5/12$) together with the first
order term using $\overline{\Delta y}$ was selected for simulation. The
only difference in the simulated DDA from that of Run (1) was the
use of $\overline{\Delta y}$. Simulation results indicated a modest
improvement over the least poor run (Run 4) of the initial simulations,
the magnitude of improvement being comparable to
the improvement of Run 4 over Run 1. The lack of a really
major improvement in using the $\overline{\Delta y}$ was interpreted to
imply that the random element of error was not the primary error
source.


b. Derivation of Digital Stieltjes Integration Algorithms - As a
result of feedback (with one iteration delay) of outputs in
Amble's method, this method when applied to a whole word
incremental computer rather than a DDA calls for the precise
application of the numerical Stieltjes integration algorithm (in
essentially whole number incremental computation) in terms
of virtual variables (refer to Chapter III) having the classical
form


(IX-18)


for second order accuracy. All simulation evidence for
single increment DDA indicated that the basic error source
of the DDA computation must stem from the inappropriateness
of the classical digital method of effecting the first (and
higher) order algorithm terms, namely the conventional (or
classical) manner of directly using the transmitted bit with half
weight to represent $\tfrac{1}{2}\Delta y_n$ in the algorithm. Since full
rate integration ($\Delta x_n = \Delta t$) is successful with this direct
representation for the first order term, the breakdown is
associated with the fact that $\Delta x_n = 0$ for intervals in a pulse
stream representation of $\Delta x_n$. Preliminary analysis (extended
at a later date to include the complicating factor of
feedback-induced lag) implied that a major error results in
algorithm realization by the conventional technique from not
taking into account that the quantity transferred when
$\Delta x_n \neq 0$ actually represents the integral increment over the
period since the last transfer. On this basis the proper algorithm
for unlagged y variables to realize first order digital
Stieltjes integration is

$$ \Delta I_n = \tfrac{1}{2}\left[\,y_n + y_{n^*_n}\right]\Delta x_n \tag{IX-19} $$

where $n^*_n$ is the iteration number of the preceding non-zero
$\Delta x_n$, as seen by inspecting the schematic in Figure 9-5 and
employing the trapezoidal rule.
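
The effect described can be seen in a small numerical example. The sketch below is illustrative only and rests on stated assumptions: the first order digital Stieltjes weight is taken to be the trapezoidal mean of y at the present and the preceding transfer, the "conventional" weight is taken to be y corrected by half of the most recent single-iteration change, and the function name `integrate` and the ramp test case are hypothetical.

```python
def integrate(y, dx, rule):
    """Accumulate y*dx over a pulse stream dx using the named first order rule."""
    total = 0.0
    last = 0                     # iteration of the preceding non-zero dx (assumed 0 at start)
    for n in range(1, len(y)):
        if dx[n] == 0:
            continue             # nothing is transferred when dx is zero
        if rule == "conventional":
            weight = y[n] - 0.5 * (y[n] - y[n - 1])   # half-weight of the latest dy only
        else:
            weight = 0.5 * (y[n] + y[last])           # mean of y since the last transfer
        total += weight * dx[n]
        last = n
    return total


K = 40                                   # number of dx pulses
N = 5 * K                                # one pulse every five iterations
y = [float(n) for n in range(N + 1)]     # ramp y(t) = t
dx = [1 if (n > 0 and n % 5 == 0) else 0 for n in range(N + 1)]
true_value = N * N / 10.0                # integral of t d(t/5) from 0 to N

stieltjes_error = integrate(y, dx, "stieltjes") - true_value
conventional_error = integrate(y, dx, "conventional") - true_value
```

For this ramp, with one transfer every five iterations, the weight taken over the whole interval since the last transfer reproduces the integral, while the conventional half-weight correction accumulates an error proportional to the number of transfers.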

Figure 9-5. Iteration Number ($n^*_n$) Preceding Non-zero $\Delta x_n$

In the reciprocal calculation, feedback of y induces a lag of
one iteration which must be corrected for. The digital Stieltjes
algorithm viewpoint discussed calls for introducing
a lag (especially for low rate phases of $\Delta x$) which may be a
large number of iterations, being five iterations in the example
of the schematic. Simulation evidence clearly showed
that further lag introduction would be deleterious (see Runs (1)
through (4)) in reciprocal calculation performance, which appeared
to be contradictory to the principle of digital Stieltjes integration
described. Resolution of the apparent contradiction is seen to lie
in the detailed effects of the feedback of y, delineated only by a
precise statement of the purpose and method of digital representation,
a degree of freedom in assignment that is at the choice of the
designer. This statement and its implications were not clearly resolved
until a later date, but are presented in parallel with the preliminary
efforts so that simulation results and theory may be correlated by the
reader.

Analysis of the source of the Stieltjes transfer error effect is based
on the difference between the proper phasing of $\Delta x$ used for
transfer and that used for y register update.

If update with $\Delta x$ is assigned the role of bringing the value of x
to the instantaneously correct value at a fixed time $\delta\tau$ before,
then if $\Delta x$, a single increment quantity, were programmed for
transfer use rather than y update (i.e., at another integrator), an
inconsistent time phase between transfer and y algorithm value is
implied. This is because $\Delta x$ calls for transfer, typically, after a
number of iterations in which no transfer occurred, and it may be
hypothesized that integrator outputs sent directly to a y register
update exactly for $\delta\tau$ before. The integrator output necessarily
represents the integral increment over the period since the last
transfer, displaced by $\delta\tau$. The proper first order algorithm for
the integral increment therefore is based on the y value at the middle
of the time interval since the last transfer, displaced by $\delta\tau$.

In the reciprocal calculation the y register receives inputs delayed one
iteration. For the case* $\delta\tau = \tfrac{1}{2}\tau$, mechanization and
programming are based on the criterion of making resolution 1/2 the
least bit. For the case $\delta\tau = 0$ the updated quantity represents
the true variable at that instant. In either case chosen for design
purposes, and adhered to consistently in design and programming, the
value of $\delta\tau$ is less than one iteration interval. The digital
Stieltjes integration algorithm for y undelayed (as by being fed
$\Delta y$ from a previous word time of the same cycle) is therefore
based on the total y change since the last transfer, denoted


* $\tau$ = iteration interval.


$$ (\Sigma\Delta y)_n = \sum_{p = n^*_n + 1}^{n} \Delta y_p \tag{IX-20} $$

where $n^*_n$ is the iteration number of the last transfer (not the
transfer of the present $n$th iteration, if any). In this case the second
order digital Stieltjes integration algorithm for the unlagged y variable
is

$$ \Delta I_n = \left[\,y_n - \tfrac{1}{2}(\Sigma\Delta y)_n - \tfrac{1}{12}\,\Delta(\Sigma\Delta y)_n\right]\Delta x_n \tag{IX-21} $$

where $y_n$, $x_n$ are virtual variables. In the case of a y variable
lagged as a result of computation of y during the last iteration cycle, a
new algorithm is derived as follows: (1) The digital Stieltjes
integration formula stated above is interpreted as being applicable with
properly modified estimates of $y_n$, $\Delta y_n$ which take into account
lagging effects. (2) The estimates of unlagged $y$, $\Delta y$ must take
into account the correlation of $\Delta x$ and $\Delta y$ changes which,
for example, in reciprocal calculation will be seen to be all important.
Thus using numerical extrapolation formulae such as

$$ (y_n)_{\text{numerical lag corrected est.}} = (y_n)_{\text{lagged}} + \Delta y_n + \Delta^2 y_n \tag{IX-22} $$

may be completely inadequate if a correlation of the times of $\Delta x$
and $\Delta y$ changes exists, since the conventional algorithm
implementation technique is tacitly based on averaging effects for a
total ensemble rather than a correlated subset. To provide a basis for
total ensemble averaging, make use of the statement that the function
consisting of the arithmetic sum of $\Delta y$ changes which occur in the
period between $\Delta x \neq 0$ occurrences may be phased without
sensitivity to correlated effects, rather reflecting the changes
primarily of genuine interest in the y function represented. Thus, if
$\Delta y \neq 0$ occurred always k iterations after $\Delta x \neq 0$,
the sum of $\Delta y$ over the $\Delta x$ change cycle can be the same
number at all phases relative to the iteration at $\Delta x \neq 0$, and
represent a constant rate of a represented variable. The analogue of
this criterion is filtering of an assigned frequency in a finite memory
filter by averaging over the period. The generality with which effects
of correlated times of $\Delta x$, $\Delta y$ changes can be removed is
then indicated as the generality with which digital Stieltjes
integration algorithms can be developed for various computation
applications. In order to derive a lag correction algorithm from the
algorithm stated for the unlagged $\Delta y_n$, $\Delta x_n$, there should
ultimately be taken into account that the latter has its firmest basis
where y is completely independent of x. In this case, the full ensemble
averaging, which gives basis to the generalization of classical concepts
of realization, firmly holds. Assume for the moment the abstract
situation that the next iteration results at $n + 1$ are known at the
$n$th iteration. Then the lag correction algorithm based on the algorithm
without y lag should ideally compute, for the approximate second order
algorithm (where the second difference with 1/12 coefficient is
neglected),


AI„  *  tYml  -  \ 

Ideal 


M+ 1  .  b+ 1 

I  AY  -  «■  A L  DYp  j  AXn 

n 


neglected 


Lag  corrected 
Algorithm 


Yn  +  AYn+  1 
2 


AYn  *♦  1 


A  AYp  -  aY  AYp  1  A> 
P*n_*t 1  *  3  p«n  *+2 
n 


(IX-23) 


(IX-24) 




steps two and three involving identities including


Ay 


n  ♦  1 


AyJ  ♦ 


♦  A 


/  n  ♦  1 

Syp 

^  p  *  n*n  ♦  2 


) 


(1X-26) 


and the Δ( ) operation on the sum is meant as a 1 iteration difference, not a difference over (n - n*) iterations.

The only unrealizable term is the last one (of the third step). Employing the principle of filtering of effects with a period of (n - n*) iterations (where at the nth iteration a transfer is assumed to have occurred), the approximation


(IX-27) 


should hold in ensemble averaging to second order. In this case the lag correction digital Stieltjes integration algorithm for approximate second order accuracy is


AI 

n 


1  **  1 
•  4  EAy„  *  i 


y„  *  Ay„. 


n  ♦  1 


p  *  n* 


nil 


(IX-28) 


should hold to the accuracy level of the algorithm for unlagged y. In evaluating the algorithm for unlagged y in the case where Δx, Δy pulse stream phase correlation exists, it should be noted that a certain degree of correlation would be presumed to


*The second order term presents the same transmission problems by conventional representation of Δy as in full rate integration. A generalization of the use of derivary terms developed for the latter can in principle overcome this problem, and can be effected by replacing Δ²y by Σ D_p, where D_p is the derivary.


c. 


be part of the proper representation of the variables. If a phasing led to error, it might be proper to initialize the R register to shift the phase to a better digital representation. The most important applications appear to involve feedback with delay, for which the lag correction digital Stieltjes algorithm derived should not introduce errors by virtue of the lag itself.
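The phasing criterion used in this section can be illustrated with a small sketch. It assumes hypothetical unit pulse streams in which Δx ≠ 0 every `period` iterations and Δy ≠ 0 always occurs k iterations later; the function names and stream shapes are illustrative assumptions and are not taken from the report. Under these assumptions the sum of Δy between successive Δx ≠ 0 occurrences is the same for every phase k, whereas the Δy value sampled only at the transfer iteration depends on k.

```python
def pulse_streams(n_iters, period, k):
    """Hypothetical unit pulse streams (an assumption for illustration):
    dx fires every `period` iterations, dy fires k iterations after each
    dx pulse, with 1 <= k <= period."""
    dx = [1 if n % period == 0 else 0 for n in range(n_iters)]
    dy = [1 if n >= k and (n - k) % period == 0 else 0 for n in range(n_iters)]
    return dx, dy


def sums_between_transfers(dx, dy):
    """Sum of dy over each interval (n*, n] between successive dx != 0 iterations."""
    sums, acc, started = [], 0, False
    for dx_n, dy_n in zip(dx, dy):
        if started:
            acc += dy_n
        if dx_n != 0:
            if started:
                sums.append(acc)
            acc, started = 0, True
    return sums


def dy_at_transfers(dx, dy):
    """dy sampled only at the transfer iterations (the first transfer is skipped)."""
    hits = [dy_n for dx_n, dy_n in zip(dx, dy) if dx_n != 0]
    return hits[1:]


if __name__ == "__main__":
    period = 8
    for k in range(1, period + 1):
        dx, dy = pulse_streams(8 * period + 1, period, k)
        print(k, sums_between_transfers(dx, dy), dy_at_transfers(dx, dy))
```

The cycle sum is the quantity that the digital Stieltjes algorithm uses in place of the single Δy value available at the transfer iteration.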


Reciprocal Calculation Runs for Single Increment DDA with Digital Stieltjes Integration Algorithm - The principles of digital Stieltjes integration discussed in the preceding section were not fully reconciled at the time of the simulations of single increment DDA. It was felt initially that the quantity $\sum_{p=n^*+1}^{n} \Delta y_p$ played an important algorithm role in the digital Stieltjes integration. Empirically and analytically, it was determined that the error properties of conventional DDA could be cancelled by further increasing lag correction. The combination of these considerations led to simulations using the algorithm

|*n  *  1  yn  *  12  ?  *  n*p  +  (IX-29) 


for the lagged y variables (D_p being the derivary). Note that Amble's method in a DDA implies that an input to each y register can occur one iteration after transfer and at no other time, and only when a transfer occurred in both integrators. Thus in Amble's method the algorithm stated above is exactly


(IX -30) 


Using $\sum_{p=n^*+1}^{n} \Delta y_p = \Delta y_{n^*+1}$ again, note that the lag correction digital Stieltjes integration algorithm derived in the preceding section is


*  i  *  i  AE  . 

n  +  1  p  =  n+ 


n  +  1 


Ax  (IX -31) 


Using the identity (where the Δ( ) operator on the sum is a 1 iteration difference)


(z  s  0  •  « 


a 

A  y. 


(IX- 32) 


and identifying Δ²y_p with the derivary, it is seen that the algorithm simulated in reciprocal calculation is identical to the analytically derived lag correction digital Stieltjes algorithm (in which Δ²y is represented by D) to within approximately second order accuracy. Runs were made for two different input functions, differing in degree of oscillatory component. The derivary terms were omitted in the third run. Detailed parameter values are presented below, together with the definition of the smoothed first difference run made previously. Results were excellent, being a fraction of an increment of error over 2000 iterations using the digital Stieltjes integration algorithm. A measure of assurance that the results were not an accidental cancelling of error for a fortuitous calculation is given by the consistent performance for two input frequency characteristics. Longer runs would probably yield good results, as no error buildup was detected.




d.  Final  Simulation  Runs  of  Reciprocal  Calculations  by  Single 
Increment  DDA  with  Non-Classical  Algorithm. 

(1)  Smoothed  First  Difference  Algorithm 


R 

o 


R* 

o 


. 502924997 
.  5 


Input: 


I  = 


_1 _ 

-5 

sin  (2  n) 


(IX-3  3) 


.487306875  First  Differ.  Smooth:  A(Ay  )  =  2  3  (Ay  -  Ay  . 

n  n  n- 1 


Algorithm : 


♦  >  a-  x6 

r  Ay„  +  - 


:  s 


Ax 

n 


(IX-34) 


I  generated  by  addition 
of  A(1  sin  '  2  *n  /  to  an  R* 
register  apart  from  DDA 
system 


Results:  Average  Error  of  .033  (1  pulse) 
after  900  iterations  (thereafter 
increasing  small  improvement 
over  algorithms 


(2)  Digital  Stieltjes  Integration  Algorithm 
Initialisation  same  as  (1) 

Input:  I  «  . .  (IX -3 5) 

2  *1  sin  (2"*n) 

s 


Algor: 


n  ♦  l- 


l£4''n* 

2 p.»*n+l 


5_ 

12 


Z  Dp 
P*n^l  _ 


(IX-36)


Results: Average absolute error .011 (1/3 pulse) in 2000 iterations. Probably good in a much longer run.


(3)  Digital  Stieltjes  Integration  Algorithm. 


Initialization  same  as  (1) 


Input:  I 


2  +  sin  (2  n) 


(IX -40) 


Algor  and  Results  essentially 
same  as  (2) 

(4) Same as (3) with the Second Order Terms Omitted.

Initialization  same  as  (3)  Results:  Essentially  same  as  (3) 

Conventional DDA algorithms with classical methods of realizing first order terms had previously led to large effective first order algorithm error. Thus a very large step in accuracy improvement was obtained by removing these effective first order algorithm errors in reciprocal calculation through use of the new lag correction digital Stieltjes integration algorithm. Since the run results were essentially those called for by perfect computation within the resolution in all three runs, the fact that the omission of the second order terms in the last run made no difference in results implies no general conclusion one way or another regarding second order algorithm terms in general computation. In parallel with this simulation effort directed toward evaluating and optimizing single increment DDA was a demanding schedule which included, as well as the preparation of strap-down computer evaluation tapes, the simulation evaluation of a revolutionary multi-increment DDA with second difference communication. The inherent rate handling capability and potential input processing capability of a multi-increment computer of acceptable cost were seen to offer the basis of design of a full aerospace mission incremental computer, the primary goal of the contract effort. Further desirable simulations of single increment DDA with the new algorithm were not made.
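The size of the first order algorithm error in reciprocal calculation can be illustrated with a floating-point sketch. It is not a DDA simulation: register quantization, overflow outputs, and feedback lag are all left out, and the function name and the choice of a 2000 iteration run are assumptions made for illustration. The sketch integrates dθ = -(θ/I) dI for I = 2 + (7/8) sin(2^-5 n), once with the bare rectangular transfer and once with the integrand advanced by half its own increment, which is the role of the first order algorithm term.

```python
import math


def run_reciprocal(n_iters, first_order_term):
    """Integrate d(theta) = -(theta / I) dI for I_n = 2 + (7/8) sin(2**-5 * n).

    Floating-point illustration only (an assumption of this sketch): no
    register quantization, overflow, or lag is modeled.
    Returns the largest |theta - 1/I| seen during the run.
    """
    w0 = 2.0 ** -5
    I = 2.0
    theta = 1.0 / I
    worst = 0.0
    for n in range(1, n_iters + 1):
        I_next = 2.0 + 0.875 * math.sin(w0 * n)
        ratio = (I_next - I) / I
        if first_order_term:
            # integrand -(theta / I) advanced by half its own increment
            theta -= theta * ratio * (1.0 - ratio)
        else:
            # rectangular transfer y * dx with no algorithm terms
            theta -= theta * ratio
        I = I_next
        worst = max(worst, abs(theta - 1.0 / I))
    return worst


if __name__ == "__main__":
    print("rectangular      :", run_reciprocal(2000, False))
    print("first order term :", run_reciprocal(2000, True))
```

Without the first order term the error has the same sign on every step and accumulates as a steady drift; with it, the leading error changes sign with ΔI and largely cancels over each input cycle.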


The ultimate value of a single increment DDA design capable of division and Stieltjes integration of air data in conventional airborne navigation systems (where associated accuracy requirements are not high) is very great. Doppler damping of improved accuracy would be possible without a special analogue computer. Previous failures of conventional DDA in such functions have been due not only to the single bit increment design feature, but very significantly to an inadequate algorithm.

A two bit increment DDA should be provided together with the new digital Stieltjes algorithm in adapted form. Such a DDA is entirely acceptable in cost for conventional airborne navigation tasks, and provides two levels of improvement in air data handling.

D. Reciprocal Calculation by the Multi-Increment QDDA With Single Increment Communication - During the early intermediate period of the Phase II study, the design of a multi-increment computer capable of division was analytically developed. The study was made on the basis of concepts of second difference output and single increment communication for band limited variables. Simulations of single increment DDA with elaborated algorithm had indicated that significant improvement, relative to the conventional DDA algorithm case, had been achieved in computations involving Stieltjes integration. An analytical basis for the digital Stieltjes integration, including relatively sophisticated lag correction methods, had not been clearly defined at that time. Overall, the implication of aerospace computer application studies, conducted in parallel with these efforts, was that a full aerospace mission computer system, in which the incremental computer assumed the major computation task, must have a significantly higher computation capability than a single increment DDA could possibly attain. This conclusion was based on the assignment of such tasks as thrust cut-off, strap-down computations, and air data computations of re-entry, as sub-routines. All such computations are individually possible in a set of single increment DDA computers of very high iteration rate, tied into a single system. This is an expensive mechanization approach for the computation capability obtained.

Granting that input processing requires multi-increment computation, the mechanization advantages of time shared internal computations imply that a degree of multi-increment bit length for the latter, less demanding computations is potentially available, and is called for in optimized design. A several bit increment internal computation not only removes basic elements of marginality characterizing single increment DDA, but is basically available in the time shared design of a computer which executes input processing. The benefits of the quotient algorithm had been attained in efforts in the computer field only in single increment computers, such as variable single increment. Benefits were analytically shown to be possible with multi-increment mechanization in a highly integrated system. The object of the simulations was to evaluate the quantitative performance of a mechanization derived on the basis of the newly developed concepts of multi-increment computation with second difference communication, for a computation involving the process of division by the basic unit developed.

For nondivision operations the structure of the mechanization insures the accuracy of a multi-increment DDA. A major goal of the simulation effort was to demonstrate the feasibility of multi-increment computation with second difference output and communication in a concrete example of computation.

1. Programming the Basic Processing Unit (QDPU) of the QDDA with Approximate Second Order Algorithm - The basic transfer action of the QDDA is a generalization of that of the conventional DDA, as several transfers to a single R register, rather than a single transfer, are performed in one cycle of operation. The basic integration algorithm operation effected in the conventional DDA, by modifying y register quantities in accordance with past values before transfer to the R register, is retained in form, and considerable quantitative correspondence exists in the QDDA transfers. One principal general difference in internal operation, apart from output generation, when viewed as a purely serial process, is that a single R register is used instead of two (or three) R registers as in a conventional DDA.

Thus the more nearly conventional DDA computer program used in preceding simulations could be modified with respect to internal operation to simulate a QDDA program by programming changes which include the collection of transferred quantities stored temporarily in their separate R registers (modified so as to have no overflow action) and then placing the resultant in the R register of the last integrator associated with the R register of the QDPU. The analytically derived most elementary unit of the multi-increment QDPU involves collection of two multi-transfers and a single transfer (the latter serving to effect an algorithm refinement for high accuracy). Thus the transfer action is that of two multi-increment DDA integrators and one single increment DDA integrator. A single multi-transfer unit and one single transfer unit could execute the cycle of operation for reciprocal calculation with the same cycle time (2 word times) as in a conventionally conceived serial multi-increment DDA provided with multi-increment communication. However, to effect processing identical to that of the QDPU, there would have to be whole word communication of R register quantities.

A second major difference in this QDPU operation is the output criterion (for second difference outputs), which is not of the natural overflow type. Hence the output criterion required a special programming modification to replace the simulation of natural overflow. The third basic processing modification is in the absorption of inputs. The input second differences are accumulated in separate (short) registers to form first differences, which are then used as Δx, Δy quantities in a conventional DDA integrator where the update y + Δy and transfer y Δx are executed. The nature of the simulation program for conventional DDA makes this direct. A final minor programming change is the replacement of second order terms of the derivary type with the direct second order differences communicated. It is clear that one of the many merits of second difference communication is that second order algorithm terms can be effected as simply as first order algorithm terms in a conventional DDA with first difference communication.
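The absorption of second difference inputs described in this section can be sketched as follows. This is a minimal illustration under stated assumptions: the function name is hypothetical, the y register is updated before the transfer (the report does not fix the order here), no algorithm terms are applied to y, and the R register simply accumulates transfers with no output criterion.

```python
def qdpu_absorb(d2x_stream, d2y_stream, y0=0):
    """Accumulate second differences into first differences, then update and transfer.

    Assumptions of this sketch (not from the report): y is updated before the
    transfer, and R accumulates transfers without any output criterion.
    Returns the final y and the accumulated R.
    """
    dx = dy = 0          # short registers holding the running first differences
    y, r = y0, 0
    for d2x, d2y in zip(d2x_stream, d2y_stream):
        dx += d2x        # first difference of x formed from its second differences
        dy += d2y        # first difference of y formed from its second differences
        y += dy          # y register update, y + dy
        r += y * dx      # transfer of y * dx into the R register
    return y, r


if __name__ == "__main__":
    # A single +1 second difference on each input gives constant first differences.
    print(qdpu_absorb([1, 0, 0, 0], [1, 0, 0, 0]))
```

Because each communicated quantity is already a second difference, a second order algorithm term needs no extra differencing inside the integrator, which is the merit noted at the end of the section.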




A  programming  of  the  elementary  unit  of  the  QDPU  may  be 
expressed  generally  as  having  the  form: 


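(1)

$$\Delta R_n^{(3)} = \bar{y}_n^{(3)}\,\Delta x_n^{(3)} + \Delta R_n^{(2)} + \Delta R_n^{(1)}$$

$$\Delta R_n^{(2)} = \bar{y}_n^{(2)}\,\Delta x_n^{(2)} \qquad \text{(IX-38)}$$

$$\Delta R_n^{(1)} = \bar{y}_n^{(1)}\,\Delta x_n^{(1)}$$

where $\Delta R_n^{(k)} = R_n^{(k)} - R_{n-1}^{(k)}$, $R_n^{(k)}$ being the contents of the kth R register, and

$$\bar{y}_n^{(k)} = y_n^{(k)} + S_1^{(k)}\,\Delta y_n^{(k)} + S_2^{(k)}\,\Delta^2 y_n^{(k)} \qquad \text{(IX-39)}$$

(classical integration algorithm realization), or

$$\bar{y}_n^{(k)} = y_n^{(k)} + S_1^{(k)} \sum_{p=n^*+1}^{n} \Delta y_p^{(k)} + S_2^{(k)} \sum_{p=n^*+1}^{n} \Delta^2 y_p^{(k)} \qquad \text{(IX-40)}$$

(digital Stieltjes integration algorithm), where n* is the iteration at which the previous non-zero transfer occurred.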


(2) 

420  (3k) 
n 

*  ••»Rn(3k)»«»(yn(2>) u  <2|r„3|-  K|y„2|> 

(IX-  4i) 

where 

A2*  (3k) 
n 

th 

is  the  k  ODPU  output  and 

(3) 

Ax  <r> 

n 

n- 1  n 

(IX-42) 

•».w 

are update rules when the kth QDPU output is programmed to be input to the rth integrator for x or the sth integrator for y, or for external inputs.


(IX-43) 


Ax(r>  « 

6l 

i 

n 

^  - 

a1, 

(m) 

(m) 


where the external input ΔI^(m) enters the QDPU.

2. Program for Reciprocal Calculation by the QDPU with Approximate Second Order Algorithm - In Chapter II a formal approximate second order numerical algorithm for division processes by integration is derived. The computation of the integral increment


(IX- 44) 


AO 


=  ;  £dx 

n  <n-ir v 


by  second  difference  output  is  shown  to  involve  the  R 
register  computation 


(IX-45) 


A  R  =  p 
n  *n 


(IX-46) 


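where p̄_n, V̄_n are the p_n, V_n quantities modified by the integration algorithm appropriate for lagged or unlagged quantities. The reciprocal calculation

$$\theta = 1/I$$

satisfies the differential equation

$$\frac{d\theta}{dt} = -\frac{\theta}{I}\,\frac{dI}{dt}$$

hence the integral increment Δθ_n, taken over the nth iteration interval, is

$$\Delta\theta_n = \int \left(-\frac{\theta}{I}\right) dI \qquad \text{(IX-47)}$$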


V 

n 


V 

n- 


1 


+ 


AVn 

12 


which has the general form of the algorithm analysis, taking p = -θ, q = I, x = I. Here V, x are unlagged since the inputs ΔI and the updating of I are chosen as unlagged. Since θ is a feedback quantity, the quantity p is lagged. Therefore the appropriate integration algorithms are

-  -(Vi  *!*n*\] 

*  [i  -  |  dl  -T5-42I  1 

^  n  2  n  12"  nj 

«  fl  -ii  Ail 

[n  12  nj 


(IX -48) 
(IX- 49) 

(IX- 50) 


the last relation entering as a first order accuracy term adequate to obtain the approximate second order accuracy. The analysis of Chapter II therefore calls, in reciprocal calculation, for programming according to the form stated in the last section such that


‘Ri  ■  [«„*t  4«„*na2e»][-4In] 

■  (Vi  *■„-  a  '*']  [-4i]  (IJC-51' 

**,  '  ]  [A] 

where the superscripts are dropped from the θ quantities, which are understood to be inputs rather than outputs. The output criterion derived in Chapter VII for second difference output is


where

$$R_n^{(3)} = R_{n-1}^{(3)} + \Delta R_n^{(1)} + \Delta R_n^{(2)} + \Delta R_n^{(3)}$$

and V̄_n is as stated above, and K is selected for the given scale of second difference output.


3. Scaling and Input Generation for the QDPU in Reciprocal Calculation Simulations - The input ΔI was generated in the same programming structure as in the single increment DDA simulations, but with a quantization for multi-increment rather than single increment. The calculations simulated were θ = 1/I where

$$I = A + B \sin(\omega_0 n)$$

for the cases A = 2, B = 7/8 and A = 3, B = 7/8, where generally ω_0 = 2^-5. For a QDPU with a single multi-transfer unit that cycles in two word times, the first simulation corresponds to the same physical input tested in single increment DDA (about 1 cps input for a 200 iter/sec DDA). For the QDDA with two multi-transfer units for one channel of computation, the simulation evaluates results for inputs with twice the frequency tested for conventional DDA. The program for generation of ΔI inputs to the QDPU is essentially exterior to the QDDA program structure. Multi-increment ΔI were generated by R register action as in a DDA with multi-increment output, in which whole word ΔI increments are added to R, with a bias-free multi-increment output criterion being programmed. The inputs are expected to closely simulate the accumulated inputs of a pulse stream input

device, where accumulation occurs over the iteration interval of the QDDA. The maximum increment ΔI of the physical input I is < 2^-5 part of full scale of I, for both single and multi-increment computation. As a result of the fact that the second difference changes by at most 2^-10, it was possible to simulate the QDDA with single increment second difference communication in a computation with 10 bit registers instead of the short (5 bit) registers forced on a single transfer DDA. The computed input was actually [illegible], computed as five bit numbers ≤ 1 in absolute value, therefore having a scale of 2^-5. The computed ΔI was programmed as independent variable with unit scale and as y register update of I with scale 2^-2 x 2^-5 = 2^-7, the first factor taking into account that the machine variable is 2^-2 I and the second taking into account physical scale; then the machine value of I is < 1. The Δ²θ outputs of +1, -1 or 0 were programmed to update θ in a y register with scale 2^-10; then |θ| < 1.

in  machine  variables  was  programmed  wi  th  a  machine  scale 
so  that 


R 


1 


•V 


*  T» 

*  *»<J’ [-****  **'«] 


(IX-53) 


has  AR,  with  2*  physical  scale  of  A I  and  since  I  had  physical 

*  2  B  2-3 

scale,  the  scales  S  g  were  necessarily  g  *  l 

since  T  >  2**1  and  A2  9  has  2* 10  physical  scale,  as  follows  from 

the  equation  25  «  2"2  S.m  2+1°.  The  output  computation  utilized 

Al  2 

the  same  scale,  hence  K  *  ^ 


4. Initial QDPU Simulations - The initial simulation was primarily intended to establish the analytically derived principle of multi-increment computation with single increment output and communication. After program debugging was complete, a successful run with algorithm terms (classically mechanized 1st and 2nd order terms) and input


*r  - 

1  <s  (1)  _ 

2  '  s2 

5 

12 

(IX- 54) 

s  . 

1  <s  U)  _ 

2  *  S2 

1 

‘  12 

1  *  2+~sin(2"5  ) 

o  n 

s .«*»  . 

13  q  (3) 

" 

*  0 

was executed in a 10,000 iteration run. The peak error* during the run was 0.0073 (absolute average over 50 iterations) in the computation involving increments < 0.0325 per iteration. Regarded as linear error growth, the improvement over conventional DDA was a factor of 150 for the serial transfer QDPU. The contemplated QDPU would handle twice the frequency input with the same performance. Note that a second order term produced by the 1st order I term of [illegible] differs somewhat from the analytically derived algorithm.

The results would presumably have been improved somewhat had the intended algorithm been simulated. Second order terms in ΔR^(1) and ΔR^(2) are shown in later simulations not to significantly affect reciprocal calculation. The second run,


*Peak error in direct printout each 1000 iterations was 0.0047. The average absolute error printout is questionable here and in the third run.


which will be described, indicates that there may be greater sensitivity in the [illegible] to terms of this order of magnitude. The second run was primarily intended to evaluate the importance of the ΔR_3 term, the approximate significance of which is deduced from the first order approximation

♦  4  R,  ■  -  (?„  -  |  •  T„)(  „  *  '  \  )  «X-5  5) 

differing  from  A  by  use  of  (£  instead  of  A  ^ 

Thus ΔR_3 primarily extrapolates Δθ as an independent variable one iteration ahead. The second simulation program differed from the first only in that ΔR_3 = 0 (discard of the ΔR_3 term). In 1000 iterations the run with the modification for ΔR_3 = 0 showed an error of 0.030, and thereafter degraded to near full magnitude error in 3000 iterations. Since ΔR_3 is a term of 1st order, its omission would be expected to produce errors up to a magnitude comparable to 1st order algorithm error in a sinusoid calculation, as was found to be the case in reciprocal calculation. A third simulation had the purpose of testing a refinement in algorithm digital realization based on generalization of the digital Stieltjes algorithm technique derived during Phase II, first for single increment computation as then formulated. The conventionally mechanized integration algorithm was seen to fail in algorithm transmission for the cumulative period during which Δx = 0 until Δx ≠ 0, which is most pronounced in single increment computation because of the high frequency of such periods.

A unified algorithm technique for general DDA computation would incorporate this special action during the less frequent

periods in which Δx = 0 until Δx ≠ 0. It will be noted that Amble's method with multi-increment quotient algorithm computation would be expected to retain, in high degree, the specific property that outputs be generated only when Δx ≠ 0, and further that, when generated, they are delayed one iteration in feedback. Thus the specific format programmed according to the then formulated digital Stieltjes algorithm is equivalent to the more general formulation, including the lag correction digital Stieltjes integration algorithm derived in a preceding section. The general order of magnitude of effect expected from the refinement in multi-increment computation would be a first order algorithm error effect reduced by a factor of 2^(-2M+2), where M is the multi-increment bit length, according to the following argument, which holds only for the zero crossings of the desired variable (not the partial rate of zeros in fraction representation where a large value is represented by the pulse stream). The frequency of zero crossings of the (desired) numerical variables is, of course, independent of M. The duration of ΔI = 0 in digital representation is proportional to 2^(-M+1). The 1st order integration algorithm integrand term magnitude is then proportional to 2^(-M+1). The scale of effect in the transfer y dx is the product of these scales, hence the factor 2^(-2M+2). The evaluation of partial rate effects for large Δx in single increment computation is more involved. For M = 5, as in the QDPU simulation, the reduction of effect relative to single increment would be 1/256, assuming the major error effect occurs around zero crossings of the desired variable. In the pertinent calculation the error effect is a 2 x 10^-4 error increment per iteration for single increment, and accordingly is estimated as 0.8 x 10^-6 for 5 bit increment computation.

In 10,000 iterations an error of 0.008 would be consistent with this error model for digital Stieltjes integration effects, in agreement with the error level determined in simulation of the QDPU without the digital Stieltjes algorithm. The third simulation incorporated the special digital Stieltjes algorithm refinement just analysed. A problem was presented in evaluation of results because of a probable error in programming the sum of absolute errors over 50 iteration intervals. The largest error in direct printout of an iteration result was 0.0047, recorded once each 1000 iterations. The single sample errors showed essentially linear amplitude growth. However, from the run start the printout of average absolute error for 50 iteration ensembles was 0.0078 near the start, growing to 0.0100. The latter appears to reflect a programming error with a bias in a modified average error evaluation routine. Assuming this to be the case, the digital Stieltjes algorithm refinement reduced error somewhat (from 0.0057 to 0.0047) in the multi-increment computation, but not in the degree hoped for. Perfect computation to within resolution could yield ±0.0005.
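The error model used in this section reduces to a few lines of arithmetic, sketched below. The function names are hypothetical. The reduction factor is taken as 2^(-2M+2), which reproduces the 1/256 quoted for M = 5, and the single increment error rate is taken as 2 x 10^-4 per iteration; that exponent is an assumption inferred from the quoted 0.008 total over 10,000 iterations, since the printed exponent is not legible in the source.

```python
def zero_crossing_reduction(m_bits):
    """Reduction factor for an M bit increment, assumed to be 2**(-2M + 2)."""
    duration_scale = 2.0 ** (-m_bits + 1)   # duration of the zero-increment period
    integrand_scale = 2.0 ** (-m_bits + 1)  # first order integrand term magnitude
    return duration_scale * integrand_scale


def projected_error(single_increment_rate, m_bits, n_iters):
    """Linear error growth over n_iters at the reduced per-iteration rate."""
    return single_increment_rate * zero_crossing_reduction(m_bits) * n_iters


if __name__ == "__main__":
    print(zero_crossing_reduction(5))
    print(projected_error(2e-4, 5, 10_000))
```

For M = 1 the factor is 1, so the model reduces to the single increment case, which is a quick check that the exponent is stated relative to single increment computation.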

5. Simulations of Reciprocal Computation by the Multi-increment QDPU With Second Order Algorithm and Modified Input Functions - Preparation of a set of runs simulating the QDDA with alternative second order algorithms and two different input functions was carried out in a parallel effort with that for the first QDPU runs. Second order algorithms were programmed to correspond to a conventional realization of individual algorithm terms, but not also the generalized digital Stieltjes algorithm realization for multi-increment computation, as in the preceding runs, which had not been fully evaluated (as a result of a programming error in the average absolute error estimate for ensembles of 50). These runs served to evaluate the degree of algorithm mechanization simplification tolerable for reciprocal calculation, as well as the performance improvement for a calculation with an input which more closely matches the demands of typical application, i.e., a less demanding calculation. The first of this set of runs (the 4th of the QDPU runs) evaluated the approximate second order algorithm stated in an earlier section and derived on the basis of theory developed in Chapter II, that is, to within the level of approximate second order algorithm stated. The simulated algorithm is defined by


Run 4(a), (b)    (IX-56)

    S_1^(1) = 1/2            S_2^(1) = 5/12
    S_1^(2) = [illegible]    S_2^(2) = 0
    S_1^(3) = -1/2           S_2^(3) = 0

in ΔR^(1), ΔR^(2), ΔR^(3) respectively, differing from the derived algorithm by 6 percent of a second order term (in [illegible]). The labeling of runs 4(a) and (b) means runs for the calculations


$$\theta = \frac{1}{2 + \frac{7}{8}\sin(2^{-5} n)} \qquad \text{(a)} \qquad \text{(IX-57)}$$

$$\theta = \frac{1}{3 + \frac{7}{8}\sin(2^{-5} n)} \qquad \text{(b)} \qquad \text{(IX-58)}$$
o  n 

respectively. The first calculation (a) involves the θ register swinging from 8/23 to 8/9 in the same time in which, in (b), the θ register swings from 8/31 to 8/17, corresponding to 2.5 times the average rate of change. Runs for comparison with runs 4(a), (b) were

Runs 5(a), (b)    (IX-59)

    S_1^(1) = 1/2     S_2^(1) = 0
    S_1^(2) = -1/2    S_2^(2) = 0
    S_1^(3) = -1/2    S_2^(3) = 0

Runs 6(a), (b)    (IX-60)

    S_1^(1) = 1/2     S_2^(1) = 1/2
    S_1^(2) = -1/2    S_2^(2) = -1/16
    S_1^(3) = -1      S_2^(3) = 0

Before presenting results of these simulations, it is desirable to indicate that the general algorithm form has equivalences when 1st and 2nd order terms are mechanized in the conventional manner. One of these equivalences involves ΔR_1 and ΔR_3, such that, since [relation illegible in the source], it follows that any choice of S_2^(1) and S_1^(3) such that (S_2^(1) + S_1^(3)) is unchanged implies algorithm equivalence (in conventional realizations).

Thus, for example, Run 4 with S_2^(1) = 5/12, S_1^(3) = -1/2 is equivalent to a run with S_2^(1) = 0, S_1^(3) = -1/12.


IX-45 


For  Run  5  with  S2^  =  0,  S^3^  *  >1/2  is  equivalent  to  a 

run  with  S2(1)  *  1/2,  Sj(3)  »  -1.  For  Run  6  with  S2(I)  «  1/2 

and  S^3^  *  -1  is  equivalent  to  a  run  with  S2^  ■  9/16, 

s/3)  *  -17/16.  As  stated  first.  Run  5  simulates  a  QDPU 

with  two  multi-transfers,  one  of  which  involves  ('•'  +  ) 

n  n 

as  independent  variable  (by  combining  AR2  and  -R^).  Simu¬ 
lated  ODD^A  and  performance  which  agreed  with  the  initial 
runs  in  that  greatest  error  occurs  at  the  peak  of  0  values 
and  attenuates  after  peaks  with  only  a  gradual  error  drift 
being  generated  permanently.  All  runs  for  calculation  (*) 
had  errors  within  a  factor  of  two  of  each  other  at  the  same 
iteration  count.  Near  the  peak  11  values,  Runs  (4),  (5),  (6) 

(a)  led  to  peak  errors  of  0.  010,  except  for  Run  5(a),  which  was 
printed  out  at  one  point  closest  to  the  maximum  0  for  which 
error  of  -0.  0180  at  *  0.  89,  and  error  of  0.  0078  at  r  *  0. 84, 
were  observed  near  the  end  of  10,  000  iteration  run.  It  is 
deduced  that- more  detailed  printout  would  have  showed  that 
average  error  over  a  p  cycle  would  be  subetantially  under 
the  peak  errors  at  the  p  peaks  of  1/4  increment  (peak)  error 
(3  bits  of  a  5  bit  increment!  perhaps  being  1/8  increment 
average  error  (2  bits  of  5  bits  increment).  The  less  demanding 
calculation,  labled  (b),  showed  for  Runs  (4),  (5),  (6)  (b)  a 
similar  closeness  in  comparative  performance  throughout  run 
periods,  where  essentially  linear  error  growth  was  observed. 

As expected, markedly better absolute performance resulted, error
drift (from initial biases) building up to about -0.003 in each
case, amounting to about [figure illegible] percent of an increment
per iteration after 10,000 iterations. The comparative 40 percent
average rate of variation of the p function in calculation (b)
compared to (a) was more than matched by fractional improvement.
The implications of simulation results which showed closely similar
performance for different second order algorithms (mechanized by
conventional identification of communicated increments with
increments of the true variable, rather than use of digital
Stieltjes identification) are that substantial improvement in
performance probably must be attained by the refinement of the
digital Stieltjes algorithm. This conclusion is considered all the
more pertinent for three bit increment computation instead of the
five bit increment computation simulated. A seventh run was made
with slight perturbations in the initial conditions $I_0$ and $s_0$
relative to Run 4(a) to determine the degree of algorithm
transmission improvement by subsignificant biases (longer register
length with artificial word length). Thus Run 7(a) used

$I_0 = 1/2 + 3 \times 2^{-15}$    (IX-61)

$s_0 = 1/2 - 3 \times 2^{-15}$    (IX-62)

as a modification of Run 4(a). Run 7(a) was similar to Run
4(a) but had peak error of 0.0087 instead of 0.0100, amounting
to a 13 percent improvement. The limited level of improvement
indicated that the digital Stieltjes refinement was still
the direction for any major improvement.
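
The coefficient equivalence stated before the run results (any choice of the two coefficients that leaves their sum unchanged is equivalent in conventional realizations) can be checked with exact fractions. The sketch below is only an illustration: the function and variable names are hypothetical, and the coefficient pairs are the ones quoted in the text for Runs 5 and 6.

```python
from fractions import Fraction as F


def equivalence_key(s2_1, s1_3):
    """Return S_2^(1) + S_1^(3); equal sums mean equivalent algorithms
    under the conventional-realization rule quoted in the text."""
    return s2_1 + s1_3


# (as-run pair, stated equivalent pair); values are taken from the text.
STATED_PAIRS = {
    "Run 5": ((F(0), F(-1, 2)), (F(1, 2), F(-1))),
    "Run 6": ((F(1, 2), F(-1)), (F(9, 16), F(-17, 16))),
}


def check(pairs):
    """Map each run label to True when both pairs share the same sum."""
    return {
        run: equivalence_key(*as_run) == equivalence_key(*equivalent)
        for run, (as_run, equivalent) in pairs.items()
    }


if __name__ == "__main__":
    print(check(STATED_PAIRS))
```

Exact rational arithmetic is used so that values such as -17/16 are compared without floating point rounding.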




In summary of overall QDD²A performance as compared to
conventional DDA, the error growth in reciprocal calculation
of $10^{-6}$ per iteration displayed in five bit increment QDD²A
simulations of calculation (a), as compared to $2 \times 10^{-4}$ per
iteration for conventional DDA, demonstrates a factor of 200
improvement in accuracy. The potential in this case is perhaps
1000 with digital Stieltjes algorithm refinement. The stated
comparison has been made for a two word time QDPU with
a single multi-transfer unit. A parallel QDPU (one word per
cycle) would be able to compute the reciprocal for inputs with
twice the frequency simulated with the same performance.

Simulations of Full Rate and Variable Rate Sinusoid Calculations
by Single Increment DDA with Elaborated Algorithm and the QDD²A -
The simulations during Phase II of full rate and variable rate
sinusoid calculations by various DDA mechanizations had the
objectives of:

1. Duplicating certain runs made during Phase I on the Alwac
computer in order to check out programming during Phase II
for the 704 computer, and to verify previous results obtained
on the Alwac during Phase I. The duplicated runs were full rate
sinusoid calculations. Evaluation of individual design features
of the derivary integrator of Phase I was desired.

2. Evaluation of error magnitudes and error correction in sinusoid
calculation by conventional and elaborated DDA mechanizations
where the independent variable is partial or variable rate;
specifically, evaluation of the digital Stieltjes integration
algorithm developed during Phase II as a result of reciprocal
calculation simulations.

3. Provided that the programming schedule for strap-down processor
evaluation permitted, evaluation of the QDD²A in multi-increment
sinusoid calculation; however, this did not prove possible.
There was preliminary consideration of the QDPU single increment
mode for the sinusoid, but the system application of the QDD²A
for sinusoid generation was determined to be best chosen for
execution in multi-increment computation as a result of sinusoid
error sensitivity in relation to processing rate, resolution,
and general precision, which is attained completely within the
mechanization structure of the proposed QDPU developed on
general computation function bases.

a. Simulation of full rate sinusoid calculation by conventional
and derivary DDA - Results of the full rate single increment
sinusoid calculation simulations provided verification of the
Phase I quantitative and comparative performance of the
conventional DDA and the DDA with derivary communication
developed during Phase I. Three innovations characterized the
derivary integrator:

(1) The technique of subsignificant biases of y registers
for improved algorithm transmission, at the price of
increased register length with given granularity.

(2) The communication of second order terms in a digital
representation in which algorithm transmission is
improved. Derivary and programmable algorithm
features in an otherwise conventional DDA were
simulated. Derivary communication uses the changes in
the quantity transferred to the R register of the
communicating integrator, as involved in Phase I.

(3) The overflow inhibitor technique, which greatly reduces
roundoff error effects associated with low rate
integrator output (as involved in Phase I).
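
For orientation, the baseline against which these innovations are judged can be sketched as a pair of cross-coupled single increment integrators generating a full rate sinusoid. This is a minimal sketch under stated assumptions, not the program run on the Alwac or the 704: it uses rectangular (old y) integration, presets the R registers to half scale for rounding, and omits subsignificant biases, derivary terms, and the overflow inhibitor. The names `simulate_dda_sinusoid` and `max_sine_error`, the 10 bit register length, and the 0.5 amplitude are all hypothetical choices.

```python
import math


def simulate_dda_sinusoid(n_bits=10, steps=None):
    """Full rate sine/cosine from two single increment integrators."""
    n = 1 << n_bits            # quanta per unit; R register modulus
    sin_y, cos_y = 0, n // 2   # y registers in quanta (amplitude 0.5)
    r_sin = r_cos = n // 2     # R registers preset to half scale (rounding)
    if steps is None:
        steps = int(2 * math.pi * n)   # roughly one period at full rate
    history = []
    for _ in range(steps):
        # Full rate input: each iteration adds the old y value to R.
        r_sin += cos_y         # d(sin) = cos dx
        r_cos -= sin_y         # d(cos) = -sin dx
        # R register overflow is the single increment output.
        d_sin, r_sin = divmod(r_sin, n)
        d_cos, r_cos = divmod(r_cos, n)
        assert d_sin in (-1, 0, 1) and d_cos in (-1, 0, 1)
        sin_y += d_sin
        cos_y += d_cos
        history.append((sin_y / n, cos_y / n))
    return history


def max_sine_error(n_bits=10):
    """Largest deviation of the computed sine from 0.5*sin(t) over a period."""
    n = 1 << n_bits
    history = simulate_dda_sinusoid(n_bits)
    return max(
        abs(s - 0.5 * math.sin((k + 1) / n))
        for k, (s, _) in enumerate(history)
    )


if __name__ == "__main__":
    print(max_sine_error(10))
```

The amplitude is kept below full scale so that every R register overflow stays a single increment; the slow amplitude growth of the rectangular rule is the kind of first order algorithm error that the elaborated integrators are meant to reduce.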

The reported performance improvement was verified
iteration for iteration, but the relative improvement
attributable to the derivary innovation was not fully
determined. The value of the overflow inhibitor was verified
repeatedly in these simulations. The value of subsignificant
biases was confirmed, but the question of the essentiality of
the derivary terms reduced to a question of programming error
in a late phase run. The case in point was a run (made serially
with two other runs at the same 704 run appointment) which was
supposed to simulate the derivary integrator with an overflow
inhibitor and subsignificant biases in which the derivary
communication was cut off. The run duplicated iteration for
iteration the results of the complete derivary integrator
repeated in Phases I and II. While for the high frequency
sinusoid simulated this is not necessarily impossible, as a
result of chance Diophantine relations of R register value for
the sinusoid amplitude, a programming error of failing to cut
off the derivary terms is judged likely. The analytical
implication that subsignificant biases primarily offer improved
transmission of the second order terms rather than of the first
order terms, which on analytical bases should be transmitted,
tends to indicate that a programming error was made. The
specific simulation parameters and summarized results (the
latter discussed above) are presented below:

Run 1  Repeat of run summarized on page 236 of the Phase I
report. Run results identical on 704 at each iteration. Again
the large reduction of frequency shift from the theoretical,
relative to conventional DDA, was observed.

Run 2  Repeat of run summarized on page 232 of the
Phase I report. Results same on 704.

Run 3  Simulated DDA to repeat Run 1 identically except
for cutoff of derivary terms. Results identical to
Run 1 iteration for iteration, indicating possible
programming error.

The design implications for full rate integration by near
conventional integrators, if the results of Run 3 were correct,
would be a DDA integrator with the overflow inhibitor and
subsignificant bias developed during Phase I but without
derivary communication. Apparently, register length increase
could be avoided by introducing a bias of 1/2 bit (rather than
smaller biases) in the y register, since it is analytically
observed that the only palpable purpose of the bias is to
assist transmission of the algorithm terms of least magnitude,
which in the first order algorithm is 1/2 bit (whereas in the
derivary second order algorithm it is much smaller). Other
simulations during Phase II of DDA and QDD²A do not clearly
resolve the question of second order integration, since digital
Stieltjes integration effects in partial rate computations
involve errors of magnitude lying between first and second
order effects. However, in all cases, the use of second order
terms either improved results or made an insignificant
difference in results (being masked by the large error effects
alluded to). Recommendations are that further runs be made to
resolve the question of the relative significance of these
component design features (especially for the generalized forms
of these features in the important problem of multi-increment
computer design).

b. Uncompleted Programming of QDD²A Sinusoid Calculation -
Programming assignments were made for simulation of
multi-increment sinusoid calculation by the QDD²A in the event
that the simulations could be successfully completed before
previously scheduled programming efforts for strap-down
processor evaluation. A first program was laid out but was not
successfully debugged because adequate programmer time was not
available.

The intent of the QDPU sinusoid runs was to establish the level
of accuracy in multi-increment computation with and without
overflow inhibitor action designed to be a generalization of
the overflow inhibitor technique developed in Phase I. Since
QDPU reciprocal calculation does not involve registers with
null values, the play of error factors which the overflow
inhibitor was developed to overcome does not occur in
reciprocal calculation.

The overflow inhibitor mechanization design will require
certain modifications to be a generalization for
multi-increment computation (from the single increment case).
The considerations in the development of the generalization are
similar to those in the generalization of the digital Stieltjes
algorithm from single increment to multi-increment computation.
The first runs contemplated for overflow inhibitor test were
based on inhibition when the least significant bit of the y
register is zero. The general output criterion is modified to
inhibit output until the condition no longer exists. In
computing a full rate sinusoid in a QDD²A it is possible to
double the register length (M to 2M) of the sinusoid relative
to a single increment DDA yet retain single increment
communication.

The mechanization price of multi-transfer operation ranges from
that of a conventional M bit multiplier to that of a three bit
increment multiplier in the D* multiplier developed in Chapter
XIII, which is capable in this case of a remarkable
mechanization economy when M > 3 (and second differences are
single increment). Contemplations were to first simulate the
sinusoid for the M = 5 case for comparison with reciprocal
calculation results, which had M = 5.

c. Simulations of Partial and Variable Rate Sinusoid
Computations by Single Increment DDA and Further Verification
of the New Digital Stieltjes Integration Algorithm.

The computation of sin I, cos I, where the input is formed
by R register overflow with transfers of
$\Delta I = K + K_a \Delta(\sin \omega_0 n)$, was chosen for a study
of partial and variable rate sinusoid computations of single
increment DDA with overflow inhibitor, with and without the
digital Stieltjes algorithm. The first two runs simulated a
sinusoid calculation with constant partial rate input
$K = 5/16$, $K_a = 0$, with four bit registers for sine and cosine
with added subsignificant bits, and with initialization and
algorithm given by:

Initialization:           $y_0 = .0078125$      $y_0 = .37890625$
                          $R_0 = .50390625$     $R_0 = .4921875$

Integration Algorithm:    $S_1^{(1)} = .5$       $S_1^{(3)} = -.5$
                          $S_2^{(1)} = .4167$    $S_2^{(3)} = -.08333$

Overflow Inhibitor
Magn. Criterion:          $T = 2^{[\ldots]}$

Simulations were made for the two cases of not using and
of using the digital Stieltjes algorithm derived earlier in
this chapter. The first case simulates the derivary
integrator developed (during Phase I) for full rate
calculations, while the latter simulates the new digital
Stieltjes integrator developed for general independent
variables. Results in the two cases are expressed as the
peak average absolute error, over ensembles of 50 iterations,
of the larger of the computed sine and cosine error values,
summarized as follows:

Partial (Constant) Rate Simulations Results

Period:                                          4000 iterations    8000 iterations
Error of Derivary Integrator System              .082               (Run stopped)
Error of Digital Stieltjes Integrators System    .020               .022

Pulse size was .0625, and exact stable calculation within
resolution would result in 1/4 (.0625) = .0156 in the average
absolute error. These results are another confirmation of the
value of the digital Stieltjes algorithm for independent
variables which are not full rate.

A second set of simulations was programmed and run
for the purpose of evaluating the relative performance
of conventional and digital Stieltjes algorithm single
increment DDA for inputs characterized by modulated
partial rates. Programming errors were detected on
the basis of the exact solution used to evaluate the runs,
which did not correspond to the assigned solution; therefore
no useful results are available. The assigned runs were
for the same initial DDA conditions as the preceding
runs for pure partial rate, modified for the new inputs:

$\Delta I_n = 2^{-4}$    (IX-63)

$\Delta I_n = 2^{-4} + \frac{7}{8}\,\Delta(\sin 2^{[\ldots]} n)$    (IX-64)

$\Delta I_n = 2^{[\ldots]} + 7\,\Delta(\sin 2^{[\ldots]} n)$    (IX-65)

Results of these simulations would be valuable in further
evaluating digital Stieltjes integration processes and
associated algorithms.
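
The partial rate results can be read against the resolution floor quoted in the text (one quarter of the .0625 pulse size). The short sketch below only restates that arithmetic; the constant and function names are hypothetical, and the error values are the three legible entries of the results table.

```python
PULSE_SIZE = 2 ** -4                 # four bit registers: 0.0625 (from the text)
RESOLUTION_FLOOR = PULSE_SIZE / 4    # 0.015625, quoted in the text as .0156

# Peak average absolute errors as tabulated (system, iteration count).
REPORTED = {
    ("derivary", 4000): 0.082,
    ("stieltjes", 4000): 0.020,
    ("stieltjes", 8000): 0.022,
}


def in_floor_units(error):
    """Express an error as a multiple of the quarter-pulse floor."""
    return error / RESOLUTION_FLOOR


if __name__ == "__main__":
    for (system, iterations), error in REPORTED.items():
        print(system, iterations, round(in_floor_units(error), 2))
```

On these figures the derivary integrator system sits a little over five times the floor at 4000 iterations, while the digital Stieltjes system stays within roughly 1.3 to 1.4 times the floor at both checkpoints.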




CHAPTER  X 


QUANTITATIVE  EVALUATION  OF  COMPUTATION  AND  RATE  HANDLING 
CAPABILITY  OF  DDA  AND  QDDA  MECHANIZATIONS 

10.0 GENERAL DISCUSSION OF QDDA COMPUTER MECHANIZATION
ALTERNATIVES FOR DIFFERENT COMPUTATION TASKS - The major problem
of quantitative evaluation of computation capacity, in relation to assumed
incremental computer mechanization features, was attacked. Before proceeding to
the quantitative evaluations, consider a number of heuristic aspects of design
associated with mechanization. The results of the computer application and
computation requirement study demonstrate that the various airborne and
aerospace application computations fall into two fairly distinct classes with
respect to their computation requirements (associated with their input
processing and internal computation requirements). The implication of this
result is that the incremental computer, assigned an overall computation task
consisting of a given set of computation routines all of which fall into the
same distinct group, should be evaluated for distinct mechanizations meeting
the pertinent computation requirements to determine the most economical
mechanization. For the case where the computation task includes computations
of both distinct computation types, a deeper design problem is presented in
mechanization optimization. This problem is encountered in the development of
a computing system for a full aerospace mission requiring input processing
capabilities. Thus, the analysis here is a quantitative step in the ultimate
quantitative function-mechanization analysis of computing systems for full
mission. The computation requirements which characterize the pertinent
classification of the computation types stem from the occurrence or
non-occurrence of variables of the real-time computation problem which vary
per unit time in such great magnitude, in relation to required resolution,
that special computer design features are required



to mechanize a computer which can execute the computation task. The
occurrence of variables of this type is said to present the rate handling
problem to DDA operation. The special computer design features which can
obviate the rate handling problem are relatively high computation iteration
rate (as obtained by processing efficiency and parallelity and/or high clock
rate) and/or multi-increment computation. Computation tasks for a total
mission which do not, in the normal sense, present a rate handling problem
(computation quantity variation per unit time does not directly imply
unacceptable resolution) are nevertheless typically characterized by another
computer design problem of comparable proportion to the rate handling problem.
The second problem consists of the twin problems of assuring ability to
execute a program consisting of a relatively large set of different
computations, and that of long term computation accuracy. Design features
required to meet what may be called the computation capability problem are:
optimized algorithm and multi-increment computation for precision; processing
versatility and parallelity for efficiency; and high iteration rate for
problem scope and accuracy. Clearly, from the standpoint of abstract
processings (apart from mechanization costs), the design problems of rate
handling and computation capability may be solved together in the design of a
single computation ensemble. In the practical case where the level of
mechanization complexity is a priori assigned, there exist mechanization
alternatives which enable handling computation tasks involving the problem of
rate handling capability in the one case, and in the other case the problem of
computation capability (the latter in the sense of not including in the
fullest degree the problem of rate handling capability). The basic reason, in
the case of incremental computers, for the possibility of distinct
mechanizations for best meeting the distinct computation problems, lies in the
relation of mechanization complexity limitation and iteration rate for
multi-increment computation in conjunction with the relation of iteration rate
and computation capacity. A mechanization of the more modest complexity level
which achieves multi-increment


computation does so only at reduced iteration rate (in consequence of
increased serialized operation), which in turn implies a source of reduced
computation capacity in considerable degree offsetting the associated ability
to increase accuracy which is implied by the selected multi-increment feature.
For the computation class presenting a computation capacity problem rather
than a rate handling problem, the allocation of hardware totaling the a priori
level of mechanization complexity may be chosen to achieve increased
processing capacity while meeting readily attainable accuracy requirements of
particular application types, rather than simply the pure alternative
allocation to obtain multi-increment computation. On the other hand,
computation tasks presenting the problem of rate handling are executed with
highest performance by allocating the a priori level of mechanization
complexity to multi-increment design, in consequence of the relatively large
increase in rate handling capability implied by multi-increment computation
relative to that attainable by resorting to a high degree of processing
parallelity. In consequence of the relations of mechanization complexity,
processing structure and computation accuracy discussed above (and given
quantitative description in the following section), a number of computer
types are defined, analyzed, and evaluated for specific tasks occurring in
airborne and aerospace applications. In particular the multi-increment QDD²A
involves all the design developments of the contract study and is capable of
all computation tasks investigated in the applications study. In consequence
of a degree of parallelized processing and the 3 bit transfer features, an
input processing capability with 6 bit increment is obtained by an economical
modal switching. Numerical evaluations for the latter are obtainable by the
formulae developed. Examples, however, were confined to internal computation.
While certain modest computation tasks may be most economically mechanized in
alternative forms discussed above (actually based on the proposed computer)
and analyzed in the following pages, the many and varied total mission
requirements can only be met by a computer on the level of sophistication of
the proposed parallelized multi-increment QDD²A with input processing
capability.




10.1 FORMULAE FOR COMPUTER COMPUTATION CAPACITY AND RATE
HANDLING CAPABILITY IN TERMS OF MECHANIZATION FEATURES - The
programmable QDPU, which is a generalization of the DDA integrator, effects
according to specific mechanization features a given number of B bit
multi-transfers within the cycle of its action, the completion of which is
followed by the next processing by the same hardware logic, according to the
generally modified program, on different stored information with different
inputs of what may be called the next QDPU of the block. Depending on the
specific design, a type of average performance of the QDPU for an application
or set of applications may be evaluated with respect to the average number of
conventional DDA integrators which would be required to program the same
application or set of applications; this average performance will be termed
the average integrator equivalent $N_I$ of the QDPU. The conventional
integrator requires one word time to be processed whereas the QDPU may,
according to chosen mechanization, require $N_w$ word times. The average
integrator equivalent per word time, $R = N_I/N_w$, is the QDPU machine
processing rate per word time of the QDDA*. Actually, computer computation
capacity is best measured by the number of different integrator equivalents
which can be processed with required accuracy for a given application or set
of applications. Hence, accuracy performance is closely related to
computation capacity. If the longest tolerable time interval between
processing the same QDPU (or DDA integrator) is $T_Q$ (or $T_D$) such that
required accuracy is obtained, then the computation capacity C for QDDA and
conventional DDA is


QDDA:    $C_Q = R \cdot T_Q / T_w$    (X-1)

DDA:     $C_D = 1 \cdot T_D / T_w = T_D / T_w$    (X-2)

The relative computation capacity of QDDA (relative to conventional DDA
without programmable integration algorithm) is the ratio

$C_{relative} = C_Q / C_D$    (X-3)

*Word time will be taken to be determined by the same state of the art
hardware, the same for QDDA and conventional DDA.
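
The capacity relations can be exercised numerically. The sketch below assumes the reading $C_Q = (N_I/N_w)\,T_Q/T_w$ and $C_D = T_D/T_w$ for (X-1) and (X-2); the function names and the sample intervals in the usage lines are hypothetical and are not figures from the report.

```python
import math


def qdda_capacity(n_i, n_w, t_q, t_w):
    """Integrator equivalents processed within the tolerable interval t_q."""
    return (n_i / n_w) * (t_q / t_w)


def dda_capacity(t_d, t_w):
    """Conventional DDA: one integrator per word time within t_d."""
    return t_d / t_w


def relative_capacity(n_i, n_w, t_q, t_d):
    """Ratio C_Q / C_D; the word time cancels."""
    return (n_i / n_w) * (t_q / t_d)


if __name__ == "__main__":
    # Hypothetical intervals: 1 ms tolerable interval, 1 microsecond word time.
    print(qdda_capacity(4.3, 2, 1e-3, 1e-6))
    print(dda_capacity(1e-3, 1e-6))
    print(relative_capacity(4.3, 2, 1e-3, 1e-3))
```

With equal tolerable intervals the ratio reduces to $N_I/N_w$, so any further advantage of the QDDA has to come from a longer tolerable interval $T_Q$, that is, from algorithm accuracy.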


The importance of state of the art word rate in determining the adequacy of a
computer makes absolute computation capacity the important quantity. The
QDPU information storage and processing configuration developed during this
contract study has $N_I = 4.3$. Mechanization complexity determines the
selected value of $N_w = 1, 2, 3, 4$. In the next section a table presents
estimates of the iteration interval required for specified accuracy for a
number of applications and hypothesized computer types, based on the theory of
computer error growth derived in this study. The above mentioned data were
used in computing the computation capacity of various computer types* and
applications for the stated word rate (obtained by state of the art
hardware). These results are shown as follows:




Computation Capacity (Integrator Equivalent) for Application Routine
in the Case of 1.4 x 10^[...] Word Rate (for other word rates multiply
tabulated values directly by word rate ratio)

[Table reproduced sideways in the source; the column headings and all rows
but one are not recoverable.]

1 bit increment QDDA:   180. integ eq   210. integ eq   10,000. integ eq   2400. integ eq   96. integ eq


Rate handling capability may be taken to be measured by the maximum
frequency $f_{max}$ of variable which can be scaled with a given resolution R.
Since the rate of change of the variable in one iteration time determines the
maximum resolution, the iteration rate of the computer is pertinent to rate
handling capability, as is the multi-increment level. Thus, the rate handling
capability may be expressed in terms of the maximum frequency of sinusoid
calculable with given resolution, which is

$f_{max} = (2^M - 1) / (2 \pi T R)$    (X-4)

where $\omega = 2 \pi f$. The iteration interval T is actually determined by
required computation capacity for given mission accuracy requirement.
However, consider a more abstract formulation* of relative $f_{max}$ of
various computers which simply assumes equal integrator equivalent count,
ignoring long term accuracy performance (this abstraction ignores, for
example, the superior algorithm accuracy of the QDDA relative to conventional
DDA). In the latter case

$f_{max}$ (relative to conventional DDA) $= (2^M - 1)\, T_D / T_Q$    (X-5)

where $T_D/T_Q$ is taken now as the relative iteration interval for the same
integrator equivalent count assuming identical word rate of the hardware. In
terms of the previously defined variables of mechanization alternatives,

$T_D / T_Q = N_I / N_w$    (X-6)

*In contrast to computation capability as analyzed above, regarded as the
ultimate quantitative evaluation of computer performance, the rate handling
capability is regarded as a heuristic quantity in this analysis.


Hence, the QDDA has relative rate handling capability

$(f_{\max})_{\text{relative to conventional DDA}} = (2^{M} - 1)\,\dfrac{N_e}{N_m}$   (X-7)

which, using $N_e = 4.3$, implies:

Single increment QDD²A with $N_m = 1$:  $f_{\max}$ relative to DDA = 4.3
3 bit increment QDD²A with $N_m = 1$:   $f_{\max}$ relative to DDA = 30  (QDD²A Single Precision Mode)
6 bit increment QDD²A with $N_m = 2$:   $f_{\max}$ relative to DDA = 135  (QDD²A Double Precision Mode)

assuming identical integrator equivalent count of both QDDA computers.
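
The three tabulated ratios can be checked with a short calculation. This is
only a sketch: it assumes that the relative rate handling capability has
the form (2^M - 1) N_e / N_m, a reading of equation (X-7) inferred from the
quoted values 4.3, 30 and 135. The function and argument names are
illustrative and do not come from the report.

```python
def relative_rate_capability(m_bits, n_e=4.3, n_m=1):
    """Relative f_max of a QDDA versus a conventional DDA.

    Assumed form (inferred, not quoted from the report):
        (2**M - 1) * N_e / N_m
    m_bits is the increment width M, n_m is 1 for single precision
    mode and 2 for double precision mode.
    """
    return (2 ** m_bits - 1) * n_e / n_m


cases = [
    ("single increment", 1, 1),
    ("3 bit increment, single precision", 3, 1),
    ("6 bit increment, double precision", 6, 2),
]
for label, m_bits, n_m in cases:
    value = relative_rate_capability(m_bits, n_m=n_m)
    print(f"{label}: {value:.2f}")
```

Under this assumed form the 3 bit and 6 bit cases come out near 30 and 135,
which is consistent with the rounded figures quoted in the text.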

10.2 PRECISION LEVELS RESULTING FROM ALGORITHM OF QDDA AND
CONVENTIONAL DDA - The precision level at a given iteration rate is
determined by integration algorithm and roundoff mechanization. The error
magnitudes produced are a function of iteration rate and period of operation.

The conventional DDA does not have a programmable algorithm and thereby
suffers, in certain computations, integration algorithm errors of first
order; that is, the deviations which would result if trapezoidal, new y,
and old y iteration algorithms were arbitrarily permuted throughout a
computation. DDA computations using servo modes introduce much larger
errors, by a factor $f_s$, where $f_s$ is believed to be about 4. The QDDA,
having a programmable algorithm, does not have integration algorithm errors
of first order but, instead, errors of the second order arising from the
fractional approximation error $\theta$ of the mechanization of second
order integration algorithm terms (for $\theta = 0$ the algorithm is good
to 2nd order; for $\theta = 1/2$ the algorithm is good only to 1st order).
The quantitative error levels resulting from integration algorithm
mechanization which will be typically encountered are


Conventional DDA:

$\epsilon = \tfrac{1}{2}\, f_s\, \omega^{2} \tau t$   ($f_s = 4$ in servo mode, $f_s = 1$ otherwise)   (X-8)

QDDA:

$\epsilon = \theta\, \omega^{3} \tau^{2} t$   (X-9)

where

$\omega = 2\pi f$, f = frequency of computation variable
$\tau$ = iteration interval between computation updating
t = time interval of operation
$f_s$ = factor of error increase in servo modes of the DDA
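
A brief numerical sketch may help compare the two algorithm error levels.
It assumes the readings epsilon = (1/2) f_s w^2 tau t for the conventional
DDA and epsilon = theta w^3 tau^2 t for the QDDA; the exponents in the
second expression are inferred on dimensional grounds and should be treated
as an assumption. The sample frequency, iteration interval, run time and
theta below are hypothetical values chosen only for illustration.

```python
import math


def dda_algorithm_error(freq_hz, tau, t, servo=False):
    """First order algorithm error, assumed form 0.5 * f_s * w**2 * tau * t."""
    omega = 2.0 * math.pi * freq_hz
    f_s = 4.0 if servo else 1.0  # servo modes are stated to be about 4x worse
    return 0.5 * f_s * omega ** 2 * tau * t


def qdda_algorithm_error(freq_hz, tau, t, theta):
    """Second order algorithm error, assumed form theta * w**3 * tau**2 * t."""
    omega = 2.0 * math.pi * freq_hz
    return theta * omega ** 3 * tau ** 2 * t


# Hypothetical operating point: 0.1 Hz variable, 1 ms iteration, 100 s run.
f, tau, t, theta = 0.1, 1.0e-3, 100.0, 0.1
print("DDA       :", dda_algorithm_error(f, tau, t))
print("DDA servo :", dda_algorithm_error(f, tau, t, servo=True))
print("QDDA      :", qdda_algorithm_error(f, tau, t, theta))
```

With these assumed forms the ratio of QDDA to non-servo DDA error is
2 theta w tau, so the advantage of the second order algorithm grows as the
iteration interval shrinks.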


Another kind of computation error is roundoff error. Conventional DDA
computers have both binary and ternary mechanizations. The design trend is
toward ternary, which has one half the roundoff error of binary
mechanization (associated with R register content) and has the further
accuracy advantage of being relatively free of the "phase" effects which
occur in binary computation (resulting from representation of zero in
binary as the alternating stream). Evaluation of QDD²A accuracy will be
relative to the conventional ternary DDA and hence tends to be
conservative. A principal advantage of the QDD²A, with respect to roundoff
effects, stems from the much reduced number of roundoff error sources,
especially those which contribute the largest elements of error, and from
the overflow inhibitor mechanism for the single increment QDD²A and
otherwise multi-increment computation. In absolute count, the number of R
registers in the QDD²A is less than one-half the count of the DDA for the
same application. In the QDD²A, for operations not involving division, the
roundoff error is reduced by a factor of two so far as R-register count is
concerned. In operations involving the division algorithm, the effective
reduction of roundoff error is much larger, since a DDA program effecting
division involves several integrators acting either as a servo loop subject
to relatively large errors compared to ordinary roundoff error, or acting
as Stieltjes integrators with integrand and independent variables subject
to spasmodic error effects comparable in magnitude to the phase error
effects troublesome in a binary DDA. In division modes (without servo),
simulations of the conventional DDA show that errors are generated which
have a behavior markedly similar in magnitude to those produced by
integration algorithm errors of first order; hence, such is assumed in the
quantitative estimates below. The absolute magnitude of roundoff errors
depends on scaling of the computations in all cases. In general, the rate
limit character of general DDA computation forces the minimum granularity
to be at least as large as that consistent with maximum rate of change of
the variable; hence, $\frac{\omega\tau}{\sqrt{12}}\, 2^{-M}$ is the minimum
rms magnitude of roundoff associated with resolution for M bit increment
(assuming a flat distribution of error). Evidently, the high frequency
variables of the computation task are the most subject to roundoff error,
as is also true for integration algorithm error. If for the moment we
regard roundoff error as a kind of noise effect, then in computations not
involving division the error growth factor is deduced to be
$\sqrt{n} = \sqrt{t/\tau}$, whereas in computations involving division in
the conventional DDA the error growth factor is $n = t/\tau$.

The attempt to explain roundoff error magnitude for the growth (not the
resolution) term in the DDA by regarding instantaneous roundoff error as a
random independently distributed variable (such that the integral over n
iterations of instantaneous roundoff error has the rms value of
$\frac{\omega\tau}{\sqrt{12}}\sqrt{n}$) is found to far overestimate the
error which actually occurs. The reason is that the roundoff error forcing
function series in time does not have genuinely independent component
errors, but rather has the property that their sum over a period is of the
same order of magnitude as the typical component, a consequence of the
residue retention property of the R register. In a system of integrators,
there can occur growth of roundoff error in consequence of transient states
of roundoff error during which the system sensitivity to the error forcing
function is first in one sense and then the other, in phase with the
reversal (having resulted from the correlative compensation effect) of the
roundoff series sum, which leads to net error buildups rather than
cancellation over the subsequent period. Thus, in sinusoid calculation the
system sensitivity oscillates, and when the roundoff error forcing function
has transient oscillatory components of the same frequency as the sinusoid,
a net error growth ensues despite the long term cancellation of the sum of
roundoff error forcing functions. Over much longer periods, the unsteady
length of the correlation period would produce random long term phase
cancellations implying a $\sqrt{t}$ growth (where t is time) in rms error.
The largest resonance effect occurs near matched frequencies, specifically
when the y register has near null value, since the time interval for
correlative reversal of roundoff is stretched to approach that of the
system response (computation weighting function). As an exception, there
may be, for short registers, other periods of strong resonance by chance
Diophantine relation of y register content such that the R register tends
to be biased in one sense for considerable periods. Scaling for maximum
resolution presents the instantaneous roundoff error of rms magnitude
$\frac{\omega\tau}{\sqrt{12}}$, which is of the same order of magnitude as
the instantaneous increment of 1st order algorithm error.
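
The residue retention argument can be illustrated with a toy experiment.
The sketch below is not the report's simulation: it models a single
integrator whose R register keeps the fractional remainder between
iterations and compares it with a hypothetical scheme that rounds every
step independently. The random integrand values and the seed are arbitrary
choices for illustration.

```python
import math
import random


def integrate_with_residue(y_values):
    """Accumulate y into an R register and emit whole-unit overflows.

    The fractional remainder stays in R, so the emitted total never
    differs from the exact sum by more than one quantum.
    """
    r = 0.0
    total = 0
    for y in y_values:
        r += y
        dz = math.floor(r)  # overflow increment emitted this iteration
        r -= dz             # residue retained for the next iteration
        total += dz
    return total


def integrate_independent_rounding(y_values):
    """Round each step on its own; errors then add like a random walk."""
    return sum(round(y) for y in y_values)


random.seed(0)
ys = [random.uniform(0.0, 1.0) for _ in range(10000)]
exact = sum(ys)
err_residue = abs(integrate_with_residue(ys) - exact)
err_independent = abs(integrate_independent_rounding(ys) - exact)
print("error with residue retention :", err_residue)
print("error with independent round :", err_independent)
```

The bounded error of the first scheme is the single-integrator version of
the statement that the roundoff sum over a period is of the same order as
the typical component; the growth mechanisms described in the text arise
only when several such integrators interact.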

Over long periods, growth of roundoff error may be associated with that of a
random noise model in which error sensitivity occurs only during the near null
states of the y registers. The resulting formulation is approximately


where the first term is simply the resolution for optimum scaling. The
roundoff error implied by this formula is consistent in magnitude with DDA
sinusoid simulation results during this program; its formulation is
consistent with the observation that sporadic roundoff effects occurred
during null passes of the y registers. The overflow inhibitor, developed
during phase I of the contract study, had the property of reducing the
sporadic phenomenon by a factor of at least 3, and hence has been chosen as
a design feature of the single increment QDDA. The reduced R register count
reduces the error by another factor of two.

Summarizing the roundoff error performance of the QDDA and conventional
ternary DDA implied by the above considerations, the approximate relations
are:


No  division  calculation:  • 

Roundoff  Error  of  i 
Conventional  DDA 

t  Division  and  servo  calculation:  c 


[Single  increment: 
aver  flow  inhibitor: 


Roundoff  Errors  of^ 
QDDA 


(M  bit  increment  (M  *  3) 


,x"‘' 

JTj  Cl  + 
o»t  /  1  ®t 

F7T  V  1  6  n72  (X-  13) 


The  total  error,  including  integration  algorithm  and  roundoff  error,  is  (where 
component  errors  actually  enter  as  sum  of  squares,  but  to  save  space  in  later 
analysis,  are  entered  linearly  without  appreciable  error  for  present  purposes): 


Conventional  DDA: 


Ctotal 


No  division  «toul 

calculation: 


JIT 

ITS 

y*  +  *  'r3®t] 

Error  in  1**  order 
algorithm 

(X-15) 

•  T 

P1 4^* 

Programmable  1st 

(X-lb) 

ITT 

order  algorithm 



Division  It  servo 
calculation: 


*  total 


?73  C  1  +  fs  ^“t  1 


(X- 17) 


QDDAt 


Single  increment  fc 
overflow  inhibitor: 


*  total 


+  Tt 


(X- 18) 


Multi-increment 
(M  bit): 


Ctotal 


,  1  tut 
+  6  tt72 


(X- 19) 


For typical programs of the conventional DDA, variables which go through
substantial variation during operating time have error magnitude of the
order

$\epsilon_{\text{total DDA}} \approx \dfrac{\omega^{2}\tau t}{2}$   (X-20)


The error in the QDDA from integration algorithm is less than this by the
factor ^ in the single increment machine, and in the multi-increment machine.

At very low frequencies the QDDA error is mainly roundoff error, typically
of the order • 2~** *

The computation capacity is computed in terms of the minimum iteration
interval required to meet specified accuracy. For the QDDA, the pertinent
minimum iteration interval $\tau_Q$ is given approximately by the quadratic

$0 = -\epsilon + \alpha\,\tau_Q + \beta\,\tau_Q^{2}$   (X-21)

v  «  ,-M  ”  1  et 

.k.„  •■^T  *  Vitj  •  stj 




0  = 


2  (1  +  5u  (M  -  1+)) 


the u function being 0 for single increment, 1 for multi-increment. The
solution to the quadratic,

$\tau_Q = \dfrac{\alpha}{2\beta}\left(\sqrt{1 + \epsilon/\epsilon_0} - 1\right)$   (X-22)

where $\epsilon_0 = \alpha^{2}/4\beta$, is simplified in two ranges of
specified accuracy separated by $\epsilon_0$, where approximations hold
based on $(\sqrt{1 + x} - 1) \approx x/2$ for $\epsilon \ll \epsilon_0$ or
$(\sqrt{1 + x} - 1) \approx \sqrt{x}$ for $\epsilon \gg \epsilon_0$, which
relate physically to the predominance of roundoff error or integration
algorithm error. Thus


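$\tau_Q \approx \dfrac{\epsilon}{\alpha}$   for $\epsilon \ll \epsilon_0$  (Roundoff Predominant)   (X-23)

$\tau_Q \approx \sqrt{\dfrac{\epsilon}{\beta}}$   for $\epsilon \gg \epsilon_0$  (Integration Algorithm Error Predominant)   (X-24)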

where 


[1  +  9u  (14  -  1*)] 

6.22M 


These  relations  for  minimum  iteration  rate  of  the  QDDA  may  be  written  in  the 
form 


for  c  <  <  ce 

(Roundoff  Predominant)  ^ 




(X-26) 


where 


*  V  2c  (1  +  5k  (M  -  1+) 


for  e  > >  c. 

0 

(Integration  Algorithm 
Error  Predominant) 


.  ,  Cl  ♦  Sr  (M  -  1*)] 
6.22M 


The  relation  for  minimum  iteration  rate  of  conventional  serial  DDA  is 


(X-27) 


A  table  of  minimum  required  iteration  rate*  for  conventional  DDA,  1,  2,  3,  4 
bit  increment  QDDA,  has  been  calculated  for  several  markedly  different  appli¬ 
cations. 


*  The  values  express  requirements  for  accuracy  independent  of  attainability 
at  state  of  the  art  bit  rate. 




•Primary  error  source  is  integration  algorithm. 
••Primary  error  source  is  roundoff  error. 




CHAPTER  XI 

MULTI-INCREMENT QDD²A PROGRAMS FOR IMPORTANT
APPLICATIONS AND PROGRAMMING EFFICIENCY EVALUATION

11.0 THE GENERAL LEVEL OF COMPUTER SOPHISTICATION REQUIRED FOR FULL
AEROSPACE MISSION - The general level of computation capability and
associated mechanization complexity required of a computer for a full
aerospace mission is determined by the program task and state-of-the-art
bit rate. The effort here is to evolve an incremental computer in a GP-DDA
computer system which takes over the bulk of the computation tasks of the
aerospace mission and leaves the remaining functions to a low cost GP with
slow multiplier.* The proposed approach is an allocation of the computation
tasks for a computer system of maximum efficiency, based on the individual
potentialities of the two computer types. Evaluation during this study of
the explicit computation programs for a full (or almost full) aerospace
mission, evolved in other contract efforts at Litton Systems, Inc.,
demonstrates that an incremental computer system of the new type evolved in
this contract study must be capable of carrying out a program (the extent
of which is expressed in conventional DDA integrators) of 400 to 500 DDA
integrators if no branching is mechanized, and 250 DDA integrators with
branching. Significant portions of such a system require new levels of
computation accuracy relative to existing DDA computers. Hence, efforts
toward evolving the serial-parallel computer structure of the QDD²A were
concentrated on mechanizations having a complexity intermediate between a
conventional DDA and an airborne GP computer. By virtue of basic
developments in multi-increment computation and modal action of computer
processing structure, significant


•Early  efforts  in  the  aerospace  computer  field  have  apparently  confined 
consideration  to  computing  systems  in  which  a  conventional  DDA  is  not 
capable  of  executing  a  substantial  part  of  the  program  task  and  requires 
a  large  GP  with  clock  rate  pressing  the  state-of-the-art. 




reductions in computer complexity for a given level of computation
capability were recognized as attainable. The system design task of
exploiting these potentialities for an incremental computer capable of
assuming the major tasks of a full aerospace mission involves a combination
of application programming and mechanization studies. The former are
given major emphasis in this chapter. The quantitative analysis of
computation capability as a function of design features in Chapter X and
the preceding investigations of general input processing mechanization
requirements for state-of-the-art bit rates imply a computer with paired
many-bit transfer and double paired several-bit transfer mechanization
alternatively programmed in a time shared arithmetic structure.

Programmable modal action of the QDD²A was developed in closely parallel
investigations of, on the one hand, application computation requirements
and, on the other, development of digital processing and arithmetic modal
design features. In this chapter, explicit programs for important
applications are developed and/or evaluated in order to evaluate the high
computation capability of the product for its intended application. The
discussion stresses that application requirements are specifically met by
basic operations. A preceding chapter presented a description of the
digital processing and arithmetic modal design techniques which make the
basic operations possible in a computer of efficient hardware
mechanization. Thus, here we assume the programmability of modes of single
and double precision and of high and intermediate iteration rate (which
make the general computations of input processing and internal computation
types possible with the same array of flip-flops, time shared and
continuously fully worked) to achieve maximum computation capability. In
internal computation, where precision is traded to achieve effectively high
speed computation by computation operation sophistication, we assume the
programmability of transfer and algorithm action defined by the QDD²A
program code in Chapter XII.


The major application for which QDD²A programmability is evaluated is
the full aerospace mission, including the many phases of guidance,
allowing for undetermined auxiliary functions of military context conceived
likely within the coming decade, in order to provide a basis for assurance
that the latter expected additional functions may be within the computation
capability of the computer (clearly beyond the capability of any existing
airborne incremental computer) when added to the already large guidance
and control program. Certain other computations are evaluated which
are known to challenge, or clearly exceed in certain respects, the
capabilities of conventional computers. Thus, the important computations
for doppler damping in airborne navigation are presented because damping
accuracy is recently known to be unsatisfactory on the basis of flight test
(and also the analysis of Chapter X). The reasons for this particular
failure stem from a combination of sinusoid generation inaccuracy for
rapidly changing craft angles and accuracy deficiency of division in the
conventional DDA. Clearly, certain military applications such as fire
control may present to an aerospace application of the future
computational demands similar to those of doppler damping and toss bombing.

Input processings for the case of strap-down computations and midcourse
guidance are analyzed.

The relations of input processing and internal computation as computation
types (general concepts for which were developed during phase I) shall be
delineated in terms of later developed modal design principles and
mechanization features. The essence of input processing requirements is the
requirement for many-bit increment computation and high iteration rate,
in respective degrees to attain a required resultant capacity. The
developed mechanization feature for the many-bit increment computation
is programmable at high or intermediate iteration rate, the iteration rate
being assignable provided the program total is not excessive. The
mechanization thus obtained is: (1) many-bit increment computation (double
precision) at high rate, to be selected for the most demanding routines;
(2) many-bit increment at intermediate rate for somewhat less demanding
routines which, it should be added, tend to be too long to be performed at
a very high rate; (3) several-bit increment computation at high rate
for computations requiring rapid decision, such as thrust cutoff after
midcourse guidance. The several-bit increment computation at very high
rate is entirely adequate for the best pulse stream transducers. Because
of increased parallel computation of the QDPU, over that in double
precision mode, high iteration rate is made compatible with attaining a
satisfactory intermediate iteration rate due to reduced time sharing
requirements; and (4) several-bit increment computation at intermediate
iteration rate for typical internal computations. The input processing
computations programmed and analyzed in detail in this applications study
may be listed as follows:

Input  Processing  Programs 

High  Rate  Single  Precision: 

(1)  Thrust  Cutoff:  For  full  aerospace  mission 

(2)  Strap-down:  For  full  aerospace  mission 

Intermediate  Rate  Double  Precision: 

(1)  Sine  and  cosine  computations  on  Air  Data  Inputs  exemplifies 
re-entry,  fire  control,  digital  autopilot  computation  subroutines 

(2)  Doppler  Damping  of  inertial  systems:  Enables  long  term  airborne 
navigation  accuracy. 

The problem of handling 90% to 100% of the computer system (GP-DDA) task
in the incremental computer is the two-fold problem of: (1) computation
capability for well behaved functions, and (2) evolving decision mode
mechanization and programming techniques for handling discontinuities
and singularities (the latter being usually a result of the coordinate
system).

The first part of this problem is resolved by the computer design
principles developed during phase II and specifically by the QDD²A. The
second problem, tlleciilM  of  tflart  in  MitbiWAUniUMiillfMvhMAJNPtPt
this study, deserves intense study. The preliminary study made of this
problem indicates that it is not insurmountable and can be eventually
achieved in relatively efficient mechanization.

11.1 QDD²A PROGRAM ANALYSES AND COMPUTER SYSTEM DESIGN
BASED ON COMPUTATION REQUIREMENTS FOR AEROSPACE GUIDANCE
AND CONTROL APPLICATIONS

A. INTRODUCTION - In order to assure that the computer design
offered in satisfaction of this study contract implies a
mechanization capable of serving the total requirements of an
orbital mission, the set of computations presented in
"Formulation of Guidance and Control Equations, Their Mechanization
and Instrumentation" (October 1961) by W. J. Jacobi and
C. S. Bridge has been selected as the minimal computation
requirements thereof. The orbital mission is postulated to
have 8 modes:

1. Boost Phase

2. Coast to Apogee (Transfer Orbit)

3. Injection into Orbit

4. Orbit

5. Retrofire

6. Transfer Orbit

7. Re-entry (Development of Aerodynamic Forces)

8. Maneuvering and Landing


The  computation  requirements  delineated  in  the  pertinent  paper 
consider  a  guidance  and  control  system  comprised  of  the  following: 

1.  Inertial  Platform 

2.  Central  Digital  Computer 

3.  Environment  Sensors  and  Buffers  for  Communication 
with  the  Outside  World 

The central digital computer is presented with minimal requirements for the
full mission as a result of assumption (1) above; thus, if it is assumed that
strap-down navigation (without inertial platform) is implemented, or if
functions other than guidance and control are to be implemented, then
the computer system must have a computation capacity in excess of that
implied in the pertinent paper. Major system output functions for
guidance and control are:

1.  Velocities  in  earth  fixed  and  space  fixed  coordinates 

2.  Altitudes  above  earth's  mean  surface 

3.  Angles  relative  to  local  vertical 

4.  Angular  position  in  earth  fixed  coordinates 

5.  Automatic  flight  control  information 

6.  Energy  management  parameters 

The multi-increment QDD²A is designed to have a QDPU (quotient
differential processing unit, a generalization of the conventional DDA
integrator) capable of an unparalleled level of speed and precision in
performing elementary incremental computations such as multiplication,
vector resolution, division, $\sqrt{x^{2} + y^{2}}$, scaling, integration,
sinusoid generation, and other DDA computations implied in the above
functions (1) through (5), in an optimized mechanization derived by
analyzing the relations of elementary computation requirements and
associated hardware requirements. While the QDD²A design evolved is
programmable for any set of incremental computations, the specific program
for (1) through (6) is largely blocked out for evaluation of the QDD²A
computation capacity, in relation to existing incremental computers, and
for potential application of QDD²A mechanization.

B.  Missile  Velocity  Computation  with  Scale  Factor,  Linear  Drift 
and  Bias  Error  Correction  of  Inputs  and  Outputs  of  the  Digital 
Computer  -  In  order  to  achieve  the  accuracy  required,  the 
computer  may  be  used  to  compensate  for  various  scale  factor, 
linear  drift,  and  bias  errors  of  input  sensors  and  computer 
commanded  transducers.  An  internal  computation  with  input 
processing  supplied  inputs  may  also  be  required  to  effect 
scale  factor  changes.  An  inertial  system  with  a  physical 
inertial  platform  employing  accelerometers  subject  to  known 
individual  errors  is  torqued  to  desired  orientation  by  torquers 
subject to known individual errors. Execution of the desired
error corrections may be shown possible by carrying out the
following calculations:

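$\Delta V_x = \Delta V_x'\,(SF_x) + b_x\,\Delta t$   (XI-1)
$\Delta V_y = \Delta V_y'\,(SF_y) + b_y\,\Delta t$
$\Delta V_z = \Delta V_z'\,(SF_z) + b_z\,\Delta t$

$\omega_x^{*}\,\Delta t = \omega_x\,\Delta t\,(SF_x') + b_x'\,\Delta t$
$\omega_y^{*}\,\Delta t = \omega_y\,\Delta t\,(SF_y') + b_y'\,\Delta t$
$\omega_z^{*}\,\Delta t = \omega_z\,\Delta t\,(SF_z') + b_z'\,\Delta t$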
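
The correction applied to each axis is a scale factor multiplication plus
a bias term proportional to the time step. A minimal sketch follows,
assuming that each corrected increment equals the raw increment times its
scale factor plus the bias rate times the interval; the function names and
the sample numbers are hypothetical and serve only to show the arithmetic.

```python
def correct_increment(raw_increment, scale_factor, bias_rate, dt):
    """Return raw_increment * scale_factor + bias_rate * dt."""
    return raw_increment * scale_factor + bias_rate * dt


def correct_axes(raw, scale_factors, bias_rates, dt):
    """Apply the same correction independently to the x, y and z axes."""
    return tuple(
        correct_increment(r, sf, b, dt)
        for r, sf, b in zip(raw, scale_factors, bias_rates)
    )


# Hypothetical accelerometer increments, scale factors and bias rates.
raw_dv = (0.020, -0.010, 0.300)
scale = (1.001, 0.999, 1.002)
bias = (1.0e-4, -2.0e-4, 0.0)
print(correct_axes(raw_dv, scale, bias, dt=0.01))
```

In a conventional DDA each of the six scale factor products and six bias
terms occupies its own integrator, which is the count the text goes on to
compare against.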


where $\Delta V_{x,y,z}$ and $\omega_{x,y,z}\,\Delta t$ are input and
output quantities respectively. The conventional DDA requires 6 integrators
to effect the scale factor corrections and 6 integrators to
effect the drift corrections of accelerometer inputs and
angular rate outputs. In the QDD²A these corrections are
blended with velocity computations to effect a marked increase
in computation capacity. Before evolving the QDD²A program,
consider the nature of the velocity computation best suited to
incremental computation.

The velocity computation apart from scale factor, linear
drift, and bias correction of inputs and outputs is derived
from the general expression for acceleration in a rotating
coordinate system,

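$\mathbf{A} = \ddot{\mathbf{R}} + \dot{\boldsymbol{\omega}} \times \mathbf{R} + 2\boldsymbol{\omega} \times \dot{\mathbf{R}} + \boldsymbol{\omega} \times (\boldsymbol{\omega} \times \mathbf{R}) - \mathbf{g}$   (XI-2)

where $\mathbf{g}$ is the gravity vector. Substituting
$\mathbf{R} = \mathbf{k}R$ in the above equation and breaking the
resulting vector equation into its components in the x, y, z directions
yields:

$A_x = \dot{\omega}_y R + 2\omega_y \dot{R} + \omega_x \omega_z R + g_x$   (XI-3)

$A_y = -\dot{\omega}_x R - 2\omega_x \dot{R} + \omega_y \omega_z R + g_y$   (XI-4)

$A_z = \ddot{R} - (\omega_x^{2} + \omega_y^{2})\, R + g_z$   (XI-5)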
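
The kinematic part of the component expansion can be verified numerically.
The sketch below evaluates the vector expression (second derivative of R,
plus angular acceleration cross R, plus twice the angular rate cross the
first derivative of R, plus the centripetal term) with R directed along the
z axis, and compares it with the three component formulas. The gravity term
is left out on purpose, since its sign convention is not examined here, and
the sample numbers are arbitrary.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def add(*vectors):
    return tuple(sum(parts) for parts in zip(*vectors))


def scale(k, v):
    return tuple(k * c for c in v)


def vector_form(w, w_dot, r, r_dot, r_ddot):
    """Kinematic acceleration with R = k*R, k the unit vector along z."""
    big_r, big_r_dot, big_r_ddot = (0.0, 0.0, r), (0.0, 0.0, r_dot), (0.0, 0.0, r_ddot)
    return add(
        big_r_ddot,
        cross(w_dot, big_r),
        scale(2.0, cross(w, big_r_dot)),
        cross(w, cross(w, big_r)),
    )


def component_form(w, w_dot, r, r_dot, r_ddot):
    """The x, y, z component expressions without the gravity terms."""
    wx, wy, wz = w
    wdx, wdy, _ = w_dot
    ax = wdy * r + 2.0 * wy * r_dot + wx * wz * r
    ay = -wdx * r - 2.0 * wx * r_dot + wy * wz * r
    az = r_ddot - (wx ** 2 + wy ** 2) * r
    return (ax, ay, az)


args = ((0.01, -0.02, 0.03), (0.001, 0.002, -0.003), 6.4e6, 120.0, -3.0)
print(vector_form(*args))
print(component_form(*args))
```

Agreement of the two printed triples confirms the cross product bookkeeping
behind the component equations.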

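These equations may be re-written in the following form suitable
for computation by the QDDA*,

$\Delta V_{x_n}^{\text{Input}} = \Delta V_x + \int_{(n-1)\tau}^{n\tau} \omega_y V_z\,dt - \int_{(n-1)\tau}^{n\tau} \omega_z V_y\,dt + \int_{(n-1)\tau}^{n\tau} g_x\,dt$   (XI-6)

$\Delta V_{y_n}^{\text{Input}} = \Delta V_y + \int_{(n-1)\tau}^{n\tau} \omega_z V_x\,dt - \int_{(n-1)\tau}^{n\tau} \omega_x V_z\,dt + \int_{(n-1)\tau}^{n\tau} g_y\,dt$   (XI-7)

$\Delta V_{z_n}^{\text{Input}} = \Delta V_z - \int_{(n-1)\tau}^{n\tau} \omega_y V_x\,dt + \int_{(n-1)\tau}^{n\tau} \omega_x V_y\,dt + \int_{(n-1)\tau}^{n\tau} g_z\,dt$   (XI-8)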


Interpreting $\Delta V_{x,y,z}$ as the scale factor, linear drift and bias
corrected input quantities generated from actual inputs, the
above equations and the correction equations presented previously yield

$\Delta V_{x_n} = \Delta X_{x_n} + \Delta M_x + b_x\,\Delta t - \int_{(n-1)\tau}^{n\tau} g_x\,dt$   (XI-9)

$\Delta V_{y_n} = \Delta X_{y_n} + \Delta M_y + b_y\,\Delta t - \int_{(n-1)\tau}^{n\tau} g_y\,dt$   (XI-10)

$\Delta V_{z_n} = \Delta X_{z_n} + \Delta M_z + b_z\,\Delta t - \int_{(n-1)\tau}^{n\tau} g_z\,dt$   (XI-11)

*But which will be further elaborated for scale factor, linear drift
and bias correction of inputs and outputs.


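where

$\Delta X_{x_n} = (SF_x)\,\Delta V_x' - \Delta\!\int V_z\; d\!\int \omega_y\,dt$
$\Delta X_{y_n} = (SF_y)\,\Delta V_y' - \Delta\!\int V_x\; d\!\int \omega_z\,dt$
$\Delta X_{z_n} = (SF_z)\,\Delta V_z' - \Delta\!\int V_y\; d\!\int \omega_x\,dt$

$\Delta M_x = \Delta\!\int V_y\; d\!\int \omega_z\,dt$
$\Delta M_y = \Delta\!\int V_z\; d\!\int \omega_x\,dt$
$\Delta M_z = \Delta\!\int V_x\; d\!\int \omega_y\,dt$

The scale and bias corrected angular rates will also be computed
in parallel QDPU action,

$k \int \omega_x^{*}\,dt = k \int (SF_x')\,\omega_x\,dt + k \int b_x'\,dt$   (XI-12)

$k \int \omega_y^{*}\,dt = k \int (SF_y')\,\omega_y\,dt + k \int b_y'\,dt$   (XI-13)

$k \int \omega_z^{*}\,dt = k \int (SF_z')\,\omega_z\,dt + k \int b_z'\,dt$   (XI-14)

The quantities $\Delta V_{x,y,z}'$ are computer inputs. The quantities
$\int_{(n-1)\tau}^{n\tau} g_{x,y,z}\,dt$ are generated in another routine
which is developed.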


The QDPU processing configuration for generation of
$\Delta X_{x_n}$, $\Delta\!\int \omega_y\,dt$ is typical for these types of
terms and is shown in Figure 11-1.

Figure 11-1. QDPU Processing Schematic

Figure 11-2. Missile Velocity Calculation with Scale Factor,
Linear Drift and Bias Error Correction of
Inputs and Output of the Digital Computer

Three QDPU generate $\Delta 2V_x$, $\Delta 2V_y$, $\Delta 2V_z$ and
$\Delta 2\!\int \omega_x\,dt$, $\Delta 2\!\int \omega_y\,dt$,
$\Delta 2\!\int \omega_z\,dt$ given $\Delta 2M_x$, $\Delta 2M_y$,
$\Delta 2M_z$ and $\Delta 2\!\int g_x\,dt$, $\Delta 2\!\int g_y\,dt$,
$\Delta 2\!\int g_z\,dt$, the latter being generated elsewhere.
The linear drifts are corrected by initial settings of the
proper "delta" registers of the QDPU. A conventional DDA
requires a word time to compute drift corrections, typically
accomplished in a quantization integrator. The two scaling
operations do the work of two additional DDA integrators.
One of the cross product terms of the velocity computation is
also computed in the same QDPU (see the $V_z\,\Delta\!\int \omega_y\,dt$
transfer in Figure 11-1). Thus, the QDPU of the diagram does the
processing of 5 DDA integrators. As a result of there being
involved in the sub-routine only 2 R registers instead of 5 R
registers as in the conventional DDA, and as a result of
multi-increment computation accuracy instead of single
increment, the accuracy of computation of the QDPU is
greatly increased. Since 3 of the 6 cross product terms are
generated in the QDPU of the type in Figure 11-1, the remaining
3 must be computed elsewhere. By selecting 3 QDPU programmed to
carry out computations involving other $V_{x,y,z}$ or
$\int \omega_{x,y,z}\,dt$ terms, the computation of all these terms
in such parallel assignment can amount to a total of less than
1 QDPU. In summary, it is seen that velocity computation
and generation of scaled, drift corrected inputs and outputs, which
require 18 integrators in a conventional DDA, are executed in
the QDDA with 3.75 QDPU, the QDDA, in effect, executing the
computation 4.8 times faster than the conventional DDA. The
flow diagram of the routine is presented in Figure 11-2.


Computation  of  Flight  Path  Angle,  Velocity  Relative  to  the 
Fixed  Pseudo  Coordinate  System,  Radial  Distance  and  Lift- 
Drag  Ratio  Computation  for  Re-entry 

1.  Routines  Investigated  -  One  of  the  critical  quantities  that 
must  be  known  upon  starting  re-entry  is  the  flight  path 
angle.  The  total  velocity  relative  to  the  fixed  pseudo 
coordinate  system  is  determined  by 


V 


(XI- 15) 


These two calculations may be executed with high efficiency in the parallel multi-increment QDDA, making use of the $x^2 + y^2$ computation capability of one of the two parallel channels. Thus, the intermediate quantity u given by


$$u = \sqrt{V_x^2 + V_y^2} \qquad \text{(XI-16)}$$

is used in both the velocity computation given by

$$V = \sqrt{u^2 + V_z^2} \qquad \text{(XI-17)}$$


as  well  as  flight  path  angle  computation. 
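The sharing of the intermediate quantity between the two calculations can be illustrated with a short whole-word sketch. It assumes that u is the magnitude formed from $V_x$ and $V_y$ (as the $[u\,du] = V_x\,dV_x + V_y\,dV_y$ term of Program II suggests) and that the flight path angle is the arctangent of $V_z/u$; the latter relation and all function names are assumptions for illustration, not taken from the report.

```python
import math

def velocity_and_flight_path_angle(v_x, v_y, v_z):
    """Whole-word illustration of the shared intermediate quantity u."""
    # Intermediate quantity: an x^2 + y^2 type computation (assumed form).
    u = math.hypot(v_x, v_y)
    # Total velocity reuses u instead of recomputing V_x^2 + V_y^2.
    v_total = math.hypot(u, v_z)
    # Flight path angle from the same u (assumed definition: atan(V_z / u)).
    gamma = math.atan2(v_z, u)
    return u, v_total, gamma

u, v_total, gamma = velocity_and_flight_path_angle(3.0, 4.0, 12.0)
print(u, v_total, gamma)
```

An incremental machine would work with the differentials of these quantities rather than the whole-word values; the sketch only shows why one intermediate serves both outputs.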


The  lift-drag  ratio  is  another  important  quantity  which 
can  be  determined  by  either  pure  inertial  means  or  by 
air  data  measurements.  The  inertial  method  depends 
upon  measurements  from  body  mounted  accelerometers 
and  calculation  of  the  true  angle  of  attack.  The  equa¬ 
tions  required  are  as  follows : 


$$L/D = \frac{A_x \sin\alpha_t - A_z \cos\alpha_t}{-A_x \cos\alpha_t - A_z \sin\alpha_t} \qquad \text{(assumes 0 angle of bank)}$$

where

$\alpha_t = \theta - \gamma$
    ($\theta$ = elevation angle as read from the inertial platform,
     $\gamma$ = flight path angle computed in previous routine)

The $A_x$, $A_z$ are whole word inputs to the digital computer. QDD²A inputs may alternatively be mechanized as $A_x$, $A_z$ directly or as $dA_x$, $dA_z$, the latter formed by special input differencing.
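As a whole-word check of the inertial lift-drag relation, the following minimal sketch evaluates L/D from the body mounted accelerometer readings and the two angles. The function and argument names are hypothetical, and zero angle of bank is assumed as in the text.

```python
import math

def lift_drag_ratio(a_x, a_z, theta, gamma):
    """L/D from body accelerations, assuming zero angle of bank."""
    alpha_t = theta - gamma                    # true angle of attack
    lift = a_x * math.sin(alpha_t) - a_z * math.cos(alpha_t)
    drag = -a_x * math.cos(alpha_t) - a_z * math.sin(alpha_t)
    return lift / drag

# With zero angle of attack the ratio reduces to A_z / A_x.
ratio = lift_drag_ratio(a_x=-2.0, a_z=-4.0, theta=0.3, gamma=0.3)
print(ratio)
```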

2. Program for Mechanization With Double Integration Mode - Intermediate phase investigations of programs for sub-routines for mechanization with a double integration mode were carried out for the pertinent calculations to evaluate the special mode as well as the general versatility of the QDD²A. While portions of the program assuming the double integration mode do not involve it, they are presented together because in the later studies, in which the special mode was discarded (on later evaluation of merits and demerits), they are most efficiently programmed (and, when the double precision mode was developed, more precisely computed) as a single coupled routine which may be compared with results presented here. Assuming the double integration mode, a program for the calculations is presented in diagrams on the following two pages, with a detailed QDPU mode diagram presented on the third page. The routine, which requires a total of 29 DDA integrators (in 29 word times) in a conventional DDA, is executed by 6 QDPU (in 6 word times), implying 1 QDPU = 4 5/6 DDA integrators.

PROGRAM I. A QDPU Modal Processing for the Lift-Drag Computation (6 Integrator Equivalent in 1 Word Time)

3. Program for Mechanization Without Double Integration Mode - The QDPU program calculations and schematics for mechanization without double integration mode are designated Program II and presented on the two pages following Program I. It is seen that the work of 28 DDA integrators is accomplished by 6 QDPU without double integration mode; hence 1 QDPU = 4 2/3 DDA integrators. This result and other subroutine analyses, as well as two other important considerations, led to the decision to discard the double integration mode.
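The integrator equivalents quoted for the routines follow from simple ratios of DDA integrator count to QDPU count. The sketch below reproduces them with exact fractions; the function name is hypothetical, and the counts are those stated in the text (29 and 28 integrators against 6 QDPU, and 18 integrators against 3.75 QDPU for the velocity routine).

```python
from fractions import Fraction

def integrator_equivalent(dda_integrators, qdpu_count):
    """DDA integrators replaced per QDPU (one QDPU occupies one word time)."""
    return Fraction(dda_integrators) / Fraction(qdpu_count)

program_1 = integrator_equivalent(29, 6)               # with double integration mode
program_2 = integrator_equivalent(28, 6)               # without double integration mode
velocity = integrator_equivalent(18, Fraction(15, 4))  # 3.75 QDPU

for name, value in [("Program I", program_1),
                    ("Program II", program_2),
                    ("Velocity routine", velocity)]:
    print(f"{name}: {value} = {float(value):.3f} DDA integrators per QDPU")
```

The difference between the two lift-drag programs is only 1/6 integrator per QDPU, which is consistent with the conclusion that the double integration mode buys little.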

The  sinusoid  computation  is  one  of  the  most  error  sensitive 
of  important  DDA  computations.  With  correct  integration 
algorithm  the  major  limitation  in  accuracy  stems  from 
roundoff  error.  Assuming  3  bit  increment  computation, 
the  level  of  accuracy  attainable  in  sinusoid  at  intermediate 
iteration  rates  on  external  input  angles  is  certainly  an 
order  of  magnitude  higher  than  attainable  with  a  conven¬ 
tional  DDA  with  correct  integration  algorithm.  Computa¬ 
tion  capability  analysis  places  the  improved  performance, 
however,  as  still  one  of  the  limits  in  overall  system 
performance.  On  development  of  the  double  precision 
mode  feature,  it  was  seen  that  this  limitation  could  be 
removed  without  cost  in  integration  rate.  This  assumes  that 
double  integration  mode  is  mechanised,  since  the  sinusoid 
requires  1  word  time,  at  least,  in  a  two  output  QDPU 
without  double  integration  capability.  The  final  deter¬ 
mining  factor  in  the  decision  to  not  mechanise  a  double 
integration  mode  is  that  the  word  time  of  the  QDPU 




$u_x$ is one of the intermediate quantities in the velocity vector computation described in the preceding section.


Program  I  Part  B  Mechanization  With  Double  Integration  (Assumed) 
Multi- QDDA  Computation  of 
Lift-Drag  Ratio  For  Re-entry 
Body  Mounted  Accelerometer  Inputs 


dtt{  d*  sinfl(t  d«t  da  cos  at  dflf^  d  (x  da^) 


is  accomplished  by  multi-increment  QDDA  by  calculation  sets: 


$$d\alpha_t = d\theta - d\gamma \qquad \text{(XI-19)}$$

$$dv = (L/D)\,dy$$

$$d\xi = -\cos\alpha_t\,dA_x - \sin\alpha_t\,dA_z; \qquad d\eta = \sin\alpha_t\,dA_x - \cos\alpha_t\,dA_z; \qquad d(L/D) = \frac{dx - dv}{y}$$

$$d\sin\alpha_t = \cos\alpha_t\,d\alpha_t, \qquad d\cos\alpha_t = -\sin\alpha_t\,d\alpha_t \qquad \text{(XI-20)}$$

$$dy = x\,d\alpha_t + d\xi$$

$$dx = -y\,d\alpha_t + d\eta$$

where x, y are defined as

$$x = A_x \sin\alpha_t - A_z \cos\alpha_t$$

$$y = -A_x \cos\alpha_t - A_z \sin\alpha_t$$
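The incremental calculation sets of Program I can be exercised numerically. The sketch below is a plain first-order (single increment style) accumulation, not the multi-increment QDPU algorithm; it assumes the relations $dx = -y\,d\alpha_t + d\eta$, $dy = x\,d\alpha_t + d\xi$ and $d(L/D) = (dx - (L/D)\,dy)/y$, with $x$ and $y$ the numerator and denominator of the lift-drag ratio. All names and the test trajectory are illustrative assumptions.

```python
import math

def step(state, d_alpha, d_ax, d_az):
    """One incremental update of (sin a, cos a, x, y, L/D)."""
    s, c, x, y, ld = state
    d_xi = -c * d_ax - s * d_az        # part of dy driven by accelerometer increments
    d_eta = s * d_ax - c * d_az        # part of dx driven by accelerometer increments
    dx = -y * d_alpha + d_eta
    dy = x * d_alpha + d_xi
    d_ld = (dx - ld * dy) / y          # d(L/D) with L/D = x / y
    return (s + c * d_alpha, c - s * d_alpha, x + dx, y + dy, ld + d_ld)

def run(n=20000):
    a_x, a_z, alpha = -3.0, -1.0, 0.1
    s, c = math.sin(alpha), math.cos(alpha)
    x = a_x * s - a_z * c
    y = -a_x * c - a_z * s
    state = (s, c, x, y, x / y)
    # Small constant increments over the run (hypothetical trajectory).
    d_alpha, d_ax, d_az = 0.2 / n, -0.5 / n, 0.3 / n
    for _ in range(n):
        state = step(state, d_alpha, d_ax, d_az)
    a_x, a_z, alpha = a_x - 0.5, a_z + 0.3, alpha + 0.2
    exact = ((a_x * math.sin(alpha) - a_z * math.cos(alpha))
             / (-a_x * math.cos(alpha) - a_z * math.sin(alpha)))
    return state[4], exact

incremental, exact = run()
print(incremental, exact)
```

The accumulated ratio tracks the directly evaluated one to within the first-order truncation error of the small increments.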


Double  Precision 


Program II Mechanization Without Double Integration Mode
Re-entry Calculations: Flight Path Angle, Total Velocity, Lift-Drag Ratio
(Multi-Increment, With Double Precision Mode)

$$[u\,du] = V_x\,dV_x + V_y\,dV_y$$

$$[V\,dV] = V_z\,dV_z + V_x\,dV_x + V_y\,dV_y$$

$$u\,dV_z - V_z\,du$$

$$d\sin\alpha_t = \cos\alpha_t\,d\alpha_t, \qquad d\cos\alpha_t = -\sin\alpha_t\,d\alpha_t \qquad \text{(Double Precision)}$$

$$d(A_x \sin\alpha_t) = \sin\alpha_t\,dA_x + A_x\,d\sin\alpha_t$$

$$d(A_x \cos\alpha_t) = \cos\alpha_t\,dA_x + A_x\,d\cos\alpha_t$$

$$d(A_z \sin\alpha_t) = \sin\alpha_t\,dA_z + A_z\,d\sin\alpha_t$$

$$d(A_z \cos\alpha_t) = \cos\alpha_t\,dA_z + A_z\,d\cos\alpha_t$$

$$d(L/D): \qquad dL^* - (L/D)\,dD^*$$

1 QDPU = 4 2/3 DDA Integrators


(which is programmable) must be increased 50 to 100% over the word length determining desired resolution.

Thus the double precision mode, which requires only a 10 to 15% longer word, actually equals the net performance, counting integrator equivalent and iteration rate together, without further elaboration of the computer.

Geographic Coordinate Computation for Aerospace
Navigation and Evaluation: Sinusoid Computation Modes
for Alternative QDPU Designs

a)  Non- Polar  Flight 

The  primary  navigation  for  the  aerospace  mission  is 
done  in  a  pseudo-coordinate  system  for  compatibility 
of  the  outputs  with  desired  orbital  display  quantities. 

Some  of  the  terms,  however,  such  as  gravitational 
effects  are  functions  of  the  geometric  coordinates. 

In  addition,  during  the  re-entry  and  landing  phases  it 
is  most  convenient  to  calculate  "geographic  coordinates". 
The  "pseudo"  coordinates  make  for  ease  of  computation 
in  many  of  the  navigation  routines  other  than  the 
coordinate  conversion.  The  computation  involved  in 
coordinate  conversion  where  flight  very  near  the  poles 
is  excluded  may  be  discussed  in  terms  of  the  explicit 
form: 

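$$\phi = \arcsin\left[\cos\psi\,\sin\theta\,\sin\beta_0 + \sin\psi\,\cos\beta_0\right] \qquad \text{(XI-21)}$$

$$\lambda = \lambda_n - \arccos\left[\frac{\cos\psi\,\cos\theta}{\cos\phi}\right]$$

$\phi$ = geocentric latitude
$\lambda$ = geocentric longitude
$\phi_g$ = geographic latitude
$\lambda_g$ = geographic longitude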

Given: $d\theta$, $d\psi$, $\sin\psi$, $\cos\psi$.

The more general case for flight through the poles will be discussed after the simpler computation programming problem is delineated and QDD²A programming performance is stated in the next set of schematics.
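The explicit form of the coordinate conversion can be checked with a whole-word sketch. The symbol readings are assumptions made from the scanned equations: $\psi$ and $\theta$ are taken as the pseudo latitude and pseudo longitude, $\beta_0$ as the orbital inclination and $\lambda_n$ as the reference longitude; the function name is hypothetical. Flight very near the poles ($\cos\phi \to 0$) is excluded, as in the text.

```python
import math

def geocentric_from_pseudo(psi, theta, beta0, lambda_n):
    """Explicit conversion (non-polar flight), angles in radians."""
    sin_phi = (math.cos(psi) * math.sin(theta) * math.sin(beta0)
               + math.sin(psi) * math.cos(beta0))
    phi = math.asin(sin_phi)
    ratio = math.cos(psi) * math.cos(theta) / math.cos(phi)
    ratio = max(-1.0, min(1.0, ratio))     # guard against roundoff outside [-1, 1]
    lam = lambda_n - math.acos(ratio)
    return phi, lam

# On the orbit plane (psi = 0), a quarter revolution from the reference
# point, the latitude equals the inclination.
phi, lam = geocentric_from_pseudo(psi=0.0, theta=math.pi / 2, beta0=0.5, lambda_n=1.0)
print(phi, lam)
```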

5. Complete Geographic Coordinate Calculations of the Proposed QDPU (Multi-Increment, Without Double Integration Mode) - Figure 11-3 shows the geographic coordinate computation for the following calculations.

Calculations:

$$\phi = \arcsin\left[\cos\psi\,\sin\theta\,\sin\beta_0 + \sin\psi\,\cos\beta_0\right] \qquad \text{(XI-22)}$$

$$\lambda = \lambda_n - \arccos\left[\frac{\cos\psi\,\cos\theta}{\cos\phi}\right]$$

$$\phi_g = \phi + \epsilon \sin 2\phi$$

$$\lambda_g = \lambda + \int \Omega\,dt$$

where $\phi_g$, $\lambda_g$ = geographic coordinates

Inputs: $d^2\theta$, $d^2\cos\psi$, $d^2\sin\psi$ (XI-23)


Differential relations used in each QDPU, and DDA integrator equivalents:

$$d\phi = \frac{\sin\beta_0\,d(\cos\psi\sin\theta) + \cos\beta_0\,d\sin\psi}{\cos\phi}$$

$$d\sin\phi = \sin\beta_0\,d(\cos\psi\sin\theta) + \cos\beta_0\,d\sin\psi$$

5 DDA Integrators

$$d(\epsilon\sin 2\phi) = \frac{\cos\phi\,d\sin\phi + \sin\phi\,d\cos\phi}{\epsilon^{-1}\,2^{-1}}$$

$$d\cos\phi = -\sin\phi\,d\phi$$

4 DDA Integrators

$$d(\cos\psi\cos\theta) = \cos\psi\,d\cos\theta + \cos\theta\,d\cos\psi$$

$$d(\cos\psi\sin\theta) = \cos\psi\,d\sin\theta + \sin\theta\,d\cos\psi$$

4 DDA Integrators

$$d\cos\theta = -\sin\theta\,d\theta$$

$$d\sin\theta = \cos\theta\,d\theta$$

2 DDA Integrators

$$d\sin(\lambda - \lambda_n) = x\,d\lambda$$

$$dx = \frac{d(\cos\psi\cos\theta) - x\,d\cos\phi}{\cos\phi}$$

5 DDA Integrators

$$d\lambda = -\frac{dx}{\sin(\lambda - \lambda_n)}$$

$$d\lambda_g = d\lambda + d\!\int \Omega\,dt$$

4 DDA Integrators

6 QDPU = 24 DDA Integrators

Figure 11-3. Complete Geographic Coordinate Calculations by the Proposed QDPU (Multi-Increment, Without Double Integration Mode)


6. Geographic Coordinates Calculation by the QDPU (Multi-Increment, Assuming a Double Integration Mode Not Incorporated in the Final Design) - Figure 11-4 shows the geographic coordinate computation for the following calculations.

Calculations:

$$\phi = \arcsin\left(\cos\psi\,\sin\theta\,\sin\beta_0 + \sin\psi\,\cos\beta_0\right) \qquad \text{(XI-24)}$$

$$\lambda = \lambda_n - \arccos\left[\frac{\cos\psi\,\cos\theta}{\cos\phi}\right] \qquad \text{(XI-25)}$$


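QDPU Operations (expressed in single increments):

I.   $d\cos\theta = -d\!\iint \cos\theta\,d\theta\,d\theta$
     $d(\cos\psi\sin\theta) = \sin\theta\,d\cos\psi + \cos\psi\,d\sin\theta$

II.  $d\sin\theta = -d\!\iint \sin\theta\,d\theta\,d\theta$
     $d(\cos\psi\cos\theta) = \cos\psi\,d\cos\theta + \cos\theta\,d\cos\psi$

III. $d\phi = \dfrac{\sin\beta_0\,d(\sin\phi/\sin\beta_0)}{\cos\phi}$
     $d\!\left(\dfrac{\cos\psi\cos\theta}{\cos\phi}\right) = \dfrac{d(\cos\psi\cos\theta) - \left(\dfrac{\cos\psi\cos\theta}{\cos\phi}\right) d\cos\phi}{\cos\phi}$

IV.  $d\sin(\lambda - \lambda_n) = -d\!\iint \sin(\lambda - \lambda_n)\,d\lambda\,d\lambda$
     $d\lambda = -\dfrac{d\!\left(\dfrac{\cos\psi\cos\theta}{\cos\phi}\right)}{\sin(\lambda - \lambda_n)}$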


Calculations:

$$\phi = \arcsin\left[\cos\psi\,\sin\theta\,\sin\beta_0 + \sin\psi\,\cos\beta_0\right] \qquad \text{(XI-26)}$$

$$\lambda = \lambda_n - \arccos\left[\frac{\cos\psi\,\cos\theta}{\cos\phi}\right] \qquad \text{(XI-27)}$$

where $\phi_g = \phi + \epsilon\sin 2\phi$, $\lambda_g = \lambda + \int\Omega\,dt$; $\phi_g$, $\lambda_g$ = geographic coordinates

Inputs: $d^2\theta$, $d^2\cos\psi$, $d^2\sin\psi$

Figure 11-4. Geographic Coordinate Calculation by the QDPU (Multi-Increment and Assuming a Double Integration Mode Not Incorporated in Final Design)


D. Flight Over the Poles and the Longitude Discontinuity - One of the most strongly felt factors in the application of incremental computers in airborne and aerospace applications is the computation problem in handling variables with isolated discontinuities. In a number of recent system developments within the field, including a full aerospace mission computer with GP-DDA, the impress of this factor has led to highly discriminated computation allocation to the DDA and such demanding assignment to the GP section that only new hardware developments and increases in GP computer complexity, already unwieldy, presented hope of approximating computation requirements along that design path. The basic reason for this impact on computer complexity stems from the basic speed disadvantage of the GP relative to the DDA, as discussed in paragraph 11.4D, which presents and analyzes the energy management computation program. That the allocation to other than DDA is not inherent for all computations with isolated discontinuities is shown in paragraph 11.4G, where the important problem of doppler damping is shown soluble with required accuracy by the QDD²A. That the allocation to other than DDA of computations for flight over the pole, and generally across a longitude discontinuity, is not inherent will be discussed here.

It should be noted that most isolated discontinuities in real time computation present themselves not so much as a result of inherent needs of computation, but as a result of arbitrary choice of coordinates which are to be displayed, as for example, geographic coordinates. While it is true that no fixed coordinate system can define path and motion on a closed surface without discontinuities, a non-fixed or conditionally varying coordinate system defining the path and motion can be free of important discontinuities. Thus, a coordinate system which is identical to geographic coordinates in regions further than 30 miles from a pole, and conditionally varied within regions close to the pole, can be free of important discontinuities. Such a coordinate system can serve as a practical geographic coordinate system, since the value of precise coordinates is primarily away from the pole.

Consider the problem of the longitude discontinuity at +180°, -180° for all latitudes; here the DDA would be required to compute longitude which changes 360° almost instantaneously. The supervision of this change in a DDA by GP is costly in programming; therefore, a DDA capable of making this change itself without extensive programming has a real advantage over any existing DDA. It is proposed that this capability be implemented by a simple mechanization elaboration (not included in decision modes of Chapter XII because of its recent derivation). A new decision mode capable of accomplishing this within the structure of a ternary DDA and the QDD²A has the output of sign and an amplitude bit communicated in the natural manner of an arithmetic output. Here, the integrator or QDPU in this decision mode puts out two bits of information, a y register sign and most significant bit, rather than normal output according to natural overflow or general (quotient) output criterion. The output in this mode requires no arithmetic process, but simply reads an already present register value and does not itself disturb the pertinent y register value at that iteration. Rather, the output is a decision command output programmed to be an output to the pertinent and any other desired integrator or QDPU. The decision command conveys that a step change of a given sign and fixed magnitude is to be made in the receiving y register. Thus, the longitude variable in a y register puts out sign of longitude and an amplitude bit (1 if positive or 0 if negative) when the magnitude exceeds 180°. Since flight is continuous, the excess over 180° produced by the last $\Delta y$ becomes the correct change had the 360° step been instantaneously subtracted at that iteration. Having programmed accordingly, the output is an input additional to the longitude increment to the same y register. At the next iteration after discontinuity crossing, the decision response acts to subtract or add 1 from the y register, mechanization for which is available already by modal action of an unused single transfer unit ordinarily required in general (quotient) algorithm computation.

Next, consider the more challenging design and/or programming problem of geographic coordinate computation in which the poles are closely or directly passed and where long term accuracy after passing the pole is the prime consideration. Preliminary analysis indicates that a combination of design modification and special programming of the problem appears promising. In this approach, which may be called the moving pole method, the coordinate system is effectively modified only when $|\phi| > 90° - \epsilon$, where $\epsilon = 1/2°$. The approach appears particularly simple for orbital flight, where the approximate minimum pass distance from the pole is essentially known. Upon entering the pertinent polar region, as detected by a decision operation in the QDD²A, the modified coordinates are formed by incremental shifting of the pole from that of true geographic position through making increments of orbital inclination $\beta$ in the formulae stated in the preceding section. For example, the increments may be selected to have magnitude equal to ordinarily computed increments of pseudo longitude (pseudo latitude and longitude coordinates are orbital coordinates where perfect flight has zero pseudo latitude). Thus, as the missile approaches the true pole, the virtual pole of the modified coordinates shifts away such that the minimum distance may be shown to be $\epsilon/\sqrt{2}$, and as the missile leaves the region the virtual pole returns to the exact position of the true pole. This proposed computation method assures long term accuracy for orbital flight passing repeatedly over or near the poles.

The decision modes of a conventional DDA and their counterpart in the QDD²A appear to provide adequate basis for supervising the moving pole method, but with considerable programming. The goal of developing an incremental computer capable of performing complete aerospace guidance and control functions now assigned to a GP-DDA system (see Chapter XIV) implies that provision should be made to effect highly versatile self-supervision without appreciable increase of total computation time. Thus the mechanization of a new decision command mode is proposed, namely, one which generates a two bit output, one bit being the sign of y in a y register. The other is a "1" only when $|y| < 2^{-K}$. Mechanization of the arithmetic part of this mode (which is a generalization of the decision command mode part for computation of longitude across the discontinuity) is straightforward. The cost of making K programmable has not been evaluated. Here, the quantity $\sin\theta$ is programmed as y and the decision command output is sent to a QDPU where it acts as positive or negative full rate when $|y| < 2^{-K}$, in the computation of $\cos\beta$, $\sin\beta$. Ideally, the net change of $\beta$ in the whole process is zero. This is only approximately assured, provided the missile has constant velocity over the 1° region. Thus, probably a more sophisticated choice of functions than $\sin\theta$ is indicated. Another method involving direct count down of all changes of $\beta$ should be examined.
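The longitude decision mode described in this section can be modeled in a few lines. The sketch is a behavioral illustration only: it assumes the y register holds longitude in degrees (the text scales the register so that the step is a subtraction or addition of 1), and the function names are hypothetical. The decision output is read after each iteration without disturbing the register, and the 360° step is applied as an additional input at the next iteration.

```python
def decision_output(y_deg):
    """Decision command: a step of fixed magnitude opposing the sign of y,
    issued only when the register magnitude exceeds 180 degrees."""
    if abs(y_deg) > 180.0:
        return -360.0 if y_deg > 0 else 360.0
    return 0.0

def run_longitude(start_deg, increments_deg):
    """Accumulate longitude increments with self-supervised wraparound."""
    y = start_deg
    pending_step = 0.0          # decision command from the previous iteration
    history = []
    for dy in increments_deg:
        y += dy + pending_step  # step enters as an input additional to dy
        pending_step = decision_output(y)
        history.append(y)
    return history

history = run_longitude(179.0, [0.4] * 5)
print(history)
```

For one iteration the register holds the excess over 180°, and after the step that excess is carried correctly into the negative range, as the text argues from the continuity of flight.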




1. Gravity, Angular and Angular Rate Computations for Aerospace Flight - The gravity computation for aerospace flight requires precision division computation. The gravity computations, angular and angular rates, are determined from the relations:


V 


<*  =--5* 

x  R 

(Xl-29) 

V 

«y  *  Hr" 

(XI- 30) 

0 

w  =  O  sin  x 
s 

(XI-31) 

« 

V 

o  y 

X  =  ‘“x  *  R 

(Xl-32) 

&  .  VX 

(XI- 33) 

R  cos  x 

dR=  dV 

s 

(XI- 34) 

*  <  sin  20  cos  Bq 

(XI- 35) 

g  =  0. 485  g  (\ 

y°  •  1 1 

5  1  *  <  sin  20  sin  B 

V 

(XI- 36) 

(XI- 37) 

The angular and angular rate calculation QDD²A program and QDPU integrator equivalent (for $\dot\theta$, $\dot\psi$, $d\omega_z$) is summarized on the following page and illustrated in Figure 11-5.

3 QDPU = 11 DDA Integrators of Programming
Figure 11-5. Angular Rate Computation Program for the QDDA


d.f  usxdt 
dfuuxdt 


1 QDPU = 4 DDA Integrators


djwxdt 

da 


COS  0 


1 QDPU = 5 DDA Integrators


(XI* 38) 


(XI- 39) 




$d\cos\psi = -\sin\psi\,d\psi$
$d\sin\psi = \cos\psi\,d\psi$
dR store

1 QDPU = 2 DDA Integrators (Double Precision) (XI-40)

The gravity computations are simplified by introducing the
variables w, v defined by

w  =  1.  454  e  v3  sin*  0 


then  it  may  be  shown 


dgxo  -  ^*1  +  (Klov»  -  2u)  d« 

(XI-41) 

$$dg_{y0} = \tan\beta_0\,dg_{x0} \qquad \text{(XI-42)}$$

d8zo  s  dv  ‘  d* 

(XI-43) 

The  calculation  may  be  executed  in  the  QDDA  by  programming 
the  calculations: 


Sxo  (Kxod®) 
da  «  9 

(K*0/2K*0) 

d?  «  f  (Kxodr) 


1 QDPU = 3 DDA Integrators (XI-44)




da 


2mdv 

v 


d6  s 


2gxdv 


1 QDPU = 6 DDA Integrators (XI-45)


1 QDPU = 6 DDA Integrators

de  «  vdv  -  (2/bzo)  dui  J 

(XI-46) 


A schematic showing the programming of these gravity calculations is shown in Figure 11-6. The gravity, angular and angular rate programs combined indicate an average integrator equivalent of the QDPU of 4-1/3, including one double precision mode.
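The quoted average follows from the totals given with Figures 11-5 and 11-6 (3 QDPU for 11 DDA integrators in the angular rate program, and 3 QDPU for 15 DDA integrators in the gravity program). As a worked check:

```latex
\[
\frac{11 + 15}{3 + 3} \;=\; \frac{26}{6} \;=\; 4\tfrac{1}{3}
\quad \text{DDA integrators per QDPU}
\]
```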




Figure  11-6.  Gravity  Computation  Program  for  the  QDDA 


(d2gy  «  tan  Bo<lagx  »**d  d*  (Kx#2)  ■  KXt)daff  done  elsewhere 
and  not  counted  here  for  QDDA  or  DDA ) 

3 QDPU = 15 DDA Integrators




11.2 THRUST CUT-OFF CALCULATIONS BY THE QDD²A, AND THE CONVENTIONAL COMPUTATIONS ALLOCATION APPROACH AND LIMITATIONS - Perhaps one of the most crucial computer operations is control of the rocket thrust termination. One previous analysis concluded that a general purpose computer was necessary for the allocation, with full time devoted to achieving accurate cut-off during the period when cut-off is imminent.*

The cut-off condition is D such that $|D| < D_0$, where $D_0 = 0.5$ ft/sec say, and

$$D = K_1 V_{Ex} + K_2 V_{Ey} + K_3 V_{Ez} \qquad \text{(XI-47)}$$

where
$V_{Ex} = V_{Px} - V_x$
$V_{Ey} = V_{Py} - V_y$
$V_{Ez} = V_{Pz} - V_z$
$V_{P(\,)}$ = desired cut-off velocities
$V_{(\,)}$ = actual velocities
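A whole-word sketch of the cut-off test may help fix the definitions. It evaluates D from the weights and the velocity errors and applies two thresholds: the 0.5 ft/sec decision level, and the coarser 30 ft/sec level that the conventional approach uses at low rate to turn on the high rate calculation. The function names and the sample numbers are hypothetical.

```python
def velocity_deviation(k, v_desired, v_actual):
    """D = K1*V_Ex + K2*V_Ey + K3*V_Ez with V_E = V_P - V (ft/sec)."""
    return sum(ki * (vp - v) for ki, vp, v in zip(k, v_desired, v_actual))

def cutoff_logic(d, coarse_d0=30.0, fine_d0=0.5):
    """Return (high_rate_on, cut_off) for a deviation d in ft/sec."""
    high_rate_on = abs(d) < coarse_d0            # low rate pre-check
    cut_off = high_rate_on and abs(d) < fine_d0  # high rate decision
    return high_rate_on, cut_off

k = (0.1, 0.2, 1.0)
d_near = velocity_deviation(k, (100.0, 200.0, 25000.0), (99.0, 199.0, 24999.9))
d_far = velocity_deviation(k, (100.0, 200.0, 25000.0), (99.0, 199.0, 24980.0))
print(d_near, cutoff_logic(d_near))
print(d_far, cutoff_logic(d_far))
```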

The quantities $K_1$, $K_2$, $K_3$ are updated at low rate. In order to limit the duration of the high rate (1000 to 2000 iter/sec) cut-off calculations, which use essentially the full arithmetic capacity of the GP computer, a low rate calculation of the same form as above but with $D_0 = 30$ ft/sec has been used to turn on the high rate calculation only when cut-off is imminent. There are several implications of this conventional computation allocation approach for GP-DDA computer systems with relatively slow DDA for partial rather than full aerospace mission:


*A concessionary approach in which craft control and other functions are temporarily neglected.


A. The quantities $V_{x,y,z}$ change primarily with inertial velocity increments (external inputs) during the cut-off decision period; however, the gravity and coriolis velocity changes internally generated at low rate have a granularity comparable to or greater than the required net accuracy. A similar effect of smaller magnitude holds for $V_{P\,x,y,z}$.

B.  The  interruption  of  the  GP  functions  of  auto- piloting,  while  the 
missile  is  under  very  high  acceleration,  is  required  during  the 
cut-off  decision  making  period  implying  the  possibility  of 
serious  loss  of  thrust  control  at  the  most  critical  time. 

C.  Any  other  functions,  such  as,  telemetering  or  tracking  in  future 
aerospace  programs  would  be  interrupted  at  least  momentarily. 

11.3 SOURCES OF COMPUTATION IMPROVEMENT AND GENERAL QDD²A PROPERTIES - Computer design and program allocation approaches which may remove these limitations are as follows: Reduced granularity of internally computed components of $(V_P - V)_{x,y,z}$ can be obtained with a higher speed DDA. By taking advantage of the continuity of the rate of change of these variables, extrapolation calculations by simple summation (in essence assuming constant rate of change) can provide accurate values during the cut-off decision period, even with low rate inputs to the high rate loop. For a single increment DDA, it would appear that considerable additional programming would be required to generate a whole word, or at least a several bit word, representation of rate of change. A several bit increment DDA would not require the additional programming elaboration.

The QDDA has natural cost free extrapolation capability in the high rate loop when updated by outputs of the low rate loop.

The interruption of computation functions other than cut-off computations during the decision period is obviated by not doing the computations in the GP or (intermediate rate) internal incremental computation loop. The QDDA, in input processing mode, can execute the cut-off decision calculations without any interruption whatsoever, and with at most an equivalent slow down of the computer of 10 to 15%, most of which buys input processing capability in addition to the thrust cut-off operation.

11.4 PROGRAMMING THE QDDA FOR THRUST CUT-OFF DECISION CALCULATIONS

A. Multi-Iteration Loop Set-Up - For a state of the art word rate of $1.4 \times 10^4$ words/sec, the QDDA, designed to have 64 QDPU (equivalent to >250 DDA integrators of program count), could have an iteration rate of 218 iter/sec if no high rate input processing were programmed. If a high rate loop for thrust cut-off is performed at 1400 iter/sec, the QDDA has a set of write heads for 10 word lines. Two QDPU suffice for the thrust cut-off calculation. The indicated set-up would enable assignment of 4 other QDPU to other input processing functions, leaving 59 QDPU for internal computations at 109 iter/sec. The program count for high rate input processing in aerospace computations, in the case where there are no strap-down computations, may not require more than 2 to 6 QDPU. If more input processing were required, the cost of another set of write heads and a small amount of logic would be imposed.
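Two of the iteration rates quoted in this set-up are consistent with dividing the word rate by the number of word times per iteration. The sketch below shows that arithmetic, assuming one QDPU per word time and one word time per word in a line; it does not attempt to derive the 109 iter/sec figure, whose basis is not spelled out in this passage. The function name is hypothetical.

```python
WORD_RATE = 1.4e4  # words/sec, the state of the art word rate quoted in the text

def iteration_rate(word_rate, words_per_iteration):
    """Iterations per second when each iteration takes the given number of word times."""
    return word_rate / words_per_iteration

full_program_rate = iteration_rate(WORD_RATE, 64)  # all 64 QDPU in one loop
high_rate_loop = iteration_rate(WORD_RATE, 10)     # 10 word line loop

print(f"64 QDPU loop:      {full_program_rate:.2f} iter/sec")
print(f"10 word line loop: {high_rate_loop:.0f} iter/sec")
```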

B. Multi-Iteration Rate QDDA Computations for Thrust Cut-Off - High iteration rate is required for thrust cut-off computations to avoid launch velocity error after thrust cut-off. High iteration rate is attainable at state of the art word rates only for very short computation routines. Properties of rates of change of the variables involved in thrust cut-off should be exploited to limit the high rate computation program to acceptable size. In the QDDA, the thrust cut-off computations are executed in a relatively high rate input processing loop in which other input processing functions may also be attained to the extent of high rate thrust cut-off program compatibility. The quantities $K_1$, $K_2$, $K_3$ are relatively slowly changing. The velocity deviation for thrust cut-off decision, $D_0$, is assumed constant. One of the $K_i$ quantities, say $K_3$, satisfies $K_3 \gg 0$; hence consider the relation

$$D^* = \frac{D}{K_3} = \frac{K_1}{K_3} V_{Ex} + \frac{K_2}{K_3} V_{Ey} + V_{Ez} = K_1^* V_{Ex} + K_2^* V_{Ey} + V_{Ez} \qquad \text{(XI-48)}$$

as determining thrust cut-off according to the condition

$$|D^*| = \left|\frac{D}{K_3}\right| < \frac{D_0}{K_3} = D_0^* \qquad \text{(XI-49)}$$

The variation of $D_0/K_3$ in one slow iteration of the QDDA internal computation loop is negligible, insofar as the variation in effective cut-off decision level which would result if the variation were neglected is insignificant (instead of being 0.5 ft/sec it might be 0.501 ft/sec, a difference far finer than the necessary resolution). With a view to eliminating computations of D (or $D/K_3$) in more than one place in the computer system (to effect a program reduction), a very precise computation is sought. The thrust cut-off program switchings may then be held to the number of thrust stages of the mission. For high accuracy in a DDA computation of D (or $D/K_3$), the treatment of both $K^*$ and $V_E$ quantities as variables at the high rate has an advantage. Programming the $D^*$ calculation in the QDD²A will be outlined and shown to essentially obtain this advantage.




Each of 2 QDPU is programmed to execute, as one of the two
parallel channels, a computation of the form

    $d(K^* V_E) = V_E\,dK^* + K^*\,dV_E$    (XI-50)

using in each QDPU two $\delta$ registers and two y registers. The
internal processing section computes $d^2D^*$, $d^2K^*$, $d^2V_{EL}$ at the
intermediate rate (≈100 it/sec) and the QDPU in the high rate
loop picks these up effectively at the intermediate rate (0 being
communicated at intermediate intervals). A quantity $dK^*$ is
available with several bits resolution at high rate because of
the low rate of change of the $K^*$. A 3 bit transfer QDDA mode
computes the $K^*$ quantities with full multi-increment accuracy
at high rate. Consider a $dV_E$ quantity which is composed of
both high and low rate quantities,

    $dV_E = dV_{EL} + dV_I$    (XI-51)

where $dV_I$ is accelerometer input at high rate and $dV_{EL}$ includes
computed gravity and earth rate induced changes at low
rate. Computing $dV_E$ and $dV_{EL}$ at the high rate assures that $V_{EL}$
may be multi-increment and that $dV_{EL}$ is consistent with this
scaling when generated at low rate, in effect producing
substantially the same outputs as if generated at high rate. While
actual computation at high rate would generate a $dV_{EL}$
communication at intermediate phases of the long iteration interval,
the lag free algorithm assumed in the low rate computation
should make the alternate communication streams essentially
equivalent. In conclusion, the computation of $d(K^* V_E)$ in the
QDPU at high rate with the particular low rate communications
is expected to have multi-increment accuracy at the high rate
so far as $K^*$ and $V_{EL}$ are concerned. The accuracy effect of
the external input $V_I$ depends on the maximum accelerometer
pulse rate and on sensor and transducer accuracies. Pulse rates
during the next 5 years are assumed under $10^4$/sec.
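The product rule of (XI-50) can be illustrated with a minimal Python sketch. It is not the QDPU program: the function names and the sample trajectories are assumptions made for illustration, and ordinary floating point stands in for the multi-increment registers. The sketch accumulates the product from first-order increments and compares it with the directly computed product; the residual is the neglected second-order term, summed over the steps.

```python
def product_increment(k, v, dk, dv):
    """First-order increment of the product k*v, in the form of (XI-50)."""
    return v * dk + k * dv


def accumulate_product(k0, v0, dk, dv, steps):
    """Accumulate k*v from increments while k and v advance by dk, dv."""
    k, v, kv = k0, v0, k0 * v0
    for _ in range(steps):
        kv += product_increment(k, v, dk, dv)
        k += dk
        v += dv
    return k, v, kv


# Hypothetical trajectories: slowly changing k, faster changing v.
k, v, kv = accumulate_product(k0=1.0, v0=100.0, dk=1e-5, dv=0.01, steps=1000)
residual = k * v - kv  # neglected dk*dv terms, one per step
```

In this sketch the residual equals steps times dk times dv, which shows why a slowly changing factor keeps the first-order form accurate.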

Having D or D* available at high accuracy, the question of
implementing the decision for thrust cut-off in the best way is
considered. At the 15g maximum thrust level the step size of D
is about 1/3 ft/sec at 1400 it/sec. If the decision for cut-off
is of the conventional type, which tests for

    $|D| < D_0$

then $D_0$ could be chosen 0.25 ft/sec without chance of passing
over the decision region. The rms error using this criterion
is at least $0.25/\sqrt{3} \approx 0.15$ ft/sec assuming precise computation
of D. This inherent error appears to be entirely consistent
with state of the art system accuracy requirements. Note
that the step size of $dV_{EL}$ computed at low rate is determined by
the gravity magnitude of 1g and the low iteration rate. For
several low iteration rates the step size of $dV_{EL}$ is:

    IR             Max Step Size of $V_{EL}$

    15 it/sec      2.7 ft/sec
    50 it/sec      0.64 ft/sec
    100 it/sec     0.32 ft/sec
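The step size and rms figures quoted above follow from simple arithmetic, sketched here in Python. The function names are illustrative assumptions, and standard gravity is taken as 32.174 ft/sec^2, a value the source does not state. The rms figure assumes the error is uniformly distributed over the decision region.

```python
import math

G_FT_PER_SEC2 = 32.174  # assumed standard gravity, ft/sec^2


def step_size(accel_g, iteration_rate):
    """Velocity change per iteration (ft/sec) at a given acceleration in g."""
    return accel_g * G_FT_PER_SEC2 / iteration_rate


def rms_uniform(half_width):
    """RMS of an error uniformly distributed over (-half_width, +half_width)."""
    return half_width / math.sqrt(3.0)


thrust_step = step_size(15, 1400)   # about 1/3 ft/sec
gravity_step_50 = step_size(1, 50)  # about 0.64 ft/sec
cutoff_rms = rms_uniform(0.25)      # about 0.15 ft/sec
```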




For 1 ft/sec error at launch as a maximum overall tolerance, it
is concluded that $dV_{EL}$ should be computed in a single increment
DDA at least at 70 to 100 it/sec. This exceeds the iteration
rate of a conventional DDA with the large program for a full
aerospace mission assuming state of the art word rates. Three
bit increment QDDA computation with 2nd difference communication
could, as a result of the high rate computation of $dV_{EL}$, compute
$d^2V_{EL}$ (from which $dV_{EL}$ is obtained) at <15 it/sec in internal
computation and meet the accuracy requirements. The decision criterion
for thrust cutoff is

    $|D^*| < D_0^*$

with the conventional decision mode (which generates a decision
command consisting of the sign of a quantity in a y register). The
thrust cut-off is determined by the logical product of two decision
command signals generated by programming inputs to the $y_1$ and $y_2$
registers in the DDA to form

    $y_1 = -D^* + D_0^*$    (XI-52)
    $y_2 = +D^* + D_0^*$    (XI-53)

Because the QDPU can be programmed for arithmetic and
decision operations in parallel, the two QDPU which compute
D* can also generate the decision quantities for thrust cut-off.

The program and operation of the 2 QDPU used for thrust cut-off
computations are presented in the schematic of Figure 11-7.
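The logical product of two sign decisions can be sketched in a few lines of Python. This is an illustration only: the function name is hypothetical, and it assumes that both y registers of (XI-52) and (XI-53) use the same positive threshold, so that the product of the two sign tests is equivalent to a magnitude test on D*.

```python
def cutoff_command(d, d0):
    """Return True when both decision commands are positive.

    y1 = -d + d0 and y2 = +d + d0 follow the form of (XI-52), (XI-53);
    a single shared threshold d0 is an assumption of this sketch.
    """
    y1 = -d + d0
    y2 = d + d0
    return (y1 > 0) and (y2 > 0)
```

With a threshold of 0.25 ft/sec, `cutoff_command(0.1, 0.25)` issues the cut-off and `cutoff_command(-0.3, 0.25)` does not.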


Figure 11-7. QDPU Program for Thrust Cut-off by Multi-Iteration
Rate Computations of Input Processing and Internal
Computation Routines
(Labels legible in the schematic: $-dD^*$ and $dV_{EL}$ at low rate; initial $\pm D^*$.)


NOTE: (1) Multi-increment $dK^*$, $dV_{EL}$ quantities (provided at low rate)
are used at high rate to obtain high rate accuracy.

(2) Formal integrator equivalent, neglecting high rate computations
with $dV_{EL}$, $dK^*$ quantities, is 2 QDPU = 6 DDA Integ. An ordinary
DDA could not reduce the granularity of $dV_{EL}$, $dK^*$, which have large
program calculations, without lowering the iteration rate an
unacceptable amount.

The QDD2A performance in the thrust cut-off function is
estimated for a number of input processing iteration rates. At each
rate, additional input processing functions may be executed
(at the same rate) whose program lengths, expressed in
DDA integrator count, total less than or equal to the tabulated
values, on the basis of internal computation at 100 iter/sec
with >200 DDA integrator program count (assuming $1.4 \times 10^4$
words/sec):

    Input Processing     Thrust Cut-off     Residual Input Processing
    Iteration Rate       RMS Error          Programming Space

    4500 iter/sec        0.03 ft/sec        0
    1400 iter/sec        0.15 ft/sec        12 to 17 DDA Integ
    700 iter/sec         0.35 ft/sec        35 to 44 DDA Integ
    400 iter/sec         0.60 ft/sec        60 to 80 DDA Integ

15 g thrust acceleration assumed (for the 10g case, look at rows
with iteration rate 1.5 times higher for performance).




The estimate of residual input processing program space
assumes 1 QDPU = 4 DDA Integ. in internal computation and
uses the formula

    (Word Rate) = $(IR_{\text{Inp Proc}})(N_{Q,\text{Residual}} + 2) + (IR_{\text{Int Comp}})(N_{Q,\text{Int Comp}})$

where $N_Q$ is a number of QDPU.

Assuming that the tolerable rms error in thrust cut-off is 0.4
ft/sec, it is seen that an extensive amount of additional input
processing is available at >700 it/sec for a 14,000 word rate.
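The word-rate budget can be checked with a short Python sketch. The function name is hypothetical. The sketch assumes the 14,000 words/sec rate quoted in the text, 2 QDPU reserved for thrust cut-off, and an internal program of 200 DDA integrators at 4 integrators per QDPU (50 QDPU). It solves the formula for the residual number of QDPU only; the conversion of that residual to a DDA integrator count is left out because the source gives the equivalence only for internal computation.

```python
def residual_qdpu(word_rate, ir_proc, ir_int, n_int_qdpu, n_cutoff_qdpu=2):
    """Solve word_rate = ir_proc*(n_res + n_cutoff) + ir_int*n_int for n_res."""
    return (word_rate - ir_int * n_int_qdpu) / ir_proc - n_cutoff_qdpu


WORD_RATE = 14000      # words/sec, as quoted in the text
INTERNAL_QDPU = 50     # assumed: 200 DDA integrators at 4 per QDPU

residuals = {
    rate: residual_qdpu(WORD_RATE, rate, 100, INTERNAL_QDPU)
    for rate in (4500, 1400, 700, 400)
}
```

At 4500 iter/sec the residual is zero, in agreement with the first row of the table; the residual grows as the input processing rate is lowered.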

C. Strap-Down Computations in the High Rate Input Processing
Loop of the QDD2A - Input processing for strap-down computations
is obtainable at very high rate in the QDD2A using multi-iteration
rate time-sharing with internal computation. The
case considered is that for pulse stream transducers for the
rate gyros (maximum pulse rate $10^4$/sec) and state of the art
bit rates for the digital computer. Figures 11-8, 11-9 and
11-10 show that strap-down computations for inertial reference
(inertial velocities are obtainable with required precision in
internal computations using the strap-down computations) can
be programmed for multi-increment computation (single precision*)
in 5 QDPU. At 1270 it/sec the input angular increment
is only 3 bits multi-increment, hence single precision effected
by 3 bit or 4 bit multi-transfer is entirely adequate for state
of the art pulse stream transducers. Thrust cutoff calculations
requiring 2 QDPU in the same input processing loop of 11
word time cycles add to input processing requirements, which
total 7 QDPU. In time-sharing with internal computation this
yields 1270 it/sec input processing and 90 it/sec internal
computation, assuming state of the art 320,000 bit/sec hardware.
Programmed internal operation of the QDPU is of two types for
strap-down computations, as indicated in the diagrams. External
inputs are drawn from the preprocessing loop, which collects
pulses as a buffer, emitting accumulated angular increments
with fixed format to the QDDA, where they are used as the associated
$\Delta x$ of each multi-transfer unit when the QDPU program instructs no
selection of a $\delta_1$, $\delta_2$, $\delta_3$ register for $\Delta x$.

*Single precision is 3 bit increment computation in the proposed
QDD2A.

Strap-down computations and thrust cutoff computations executed
at 1270 it/sec (0.32 megabits/sec hardware). Internal computations
at 90 it/sec. Total integrator equivalent of program, 256 DDA
integrators.

Figure 11-8. Strap-Down Computation QDPU Program (Craft Orientation
in Inertial Space)

External inputs have their own registers. The QDPU program code for
external input is $I_1 = 1$, $I_2 = 1$ (contradictory since it calls for
the same quantity in two y registers, hence defines modal action).

Figure 11-9. Strap-Down Computation QDPU Program Type for
3 of 5 QDPU

External inputs have their own registers. The $\Delta x$ of $MT_1$, $MT_2$, $MT_3$ is
programmed for external input by the selection $I_1 = I_2 = 1$ and a $\Delta x$
selection code excluding $\delta$ registers ($I_1 = I_2 = 1$ is contradictory since
it calls for two y registers holding the same quantity, hence defines
modal action).

Figure 11-10. Strap-Down Computation QDPU Program Type for
2 of 5 QDPU

Evaluation of QDD2A For Computations of Energy Management
During Re-entry

1. Computation Structure and Comparison of GP and QDDA
Performance - The re-entry computation program (assuming
the unified control approach as formulated by Daniel
Dommasch) is very large (of the order of 250 DDA integrators).
Re-entry is the most critical phase of the entire
mission since the re-entry corridor is quite narrow due
to structural limitations and maximum allowable heat input
to the vehicle. The pitch acceleration command computation,
essential to temperature control, involves decision
selection of functions involving division and square root,
which are shown to be very efficiently generated in the
QDDA (analysed in Chapter XII). External inputs important
in the energy management computations are pitch and roll
angles, wall temperature and the time rate, and air density
and altitude. These quantities change at relatively high
rates during re-entry and present an insurmountable computation
problem to a conventional DDA in generating the
extensive vector transformations, spherical triangle solutions,
energy and aerodynamic computations of the
program, and the final command signal calculations.

Recent efforts in aerospace guidance and control have been
predicated on allocation of the bulk of these calculations
to the GP section. The copious supply of sin, cos, arc
sin, arc cos and arc tan calculations in the re-entry program,
each one of which requires 0.4 milliseconds in an advanced GP with
fast multiplier and clock rate pressing the state of the art,
is a contributing factor to the low iteration rate of the GP
program (10 iter/sec).

The QDD2A can increment these functions in the indicated word
times by executing the indicated operations:

    Sinusoid (Double Precision for Inputs), 1 WT

        $d \sin\theta = \cos\theta\,d\theta$                          (XI-54)
        $d \cos\theta = -\sin\theta\,d\theta$                         (XI-55)

    Arc sin x (Single Precision non-input), 1 WT

        $d\theta = \dfrac{dx}{\sqrt{1 - x^2}}$                       (XI-56)
        $d\sqrt{1 - x^2} = \dfrac{-x\,dx}{\sqrt{1 - x^2}}$          (XI-57)

    Arc cos x (Single Precision non-input), 1 WT

        $d\theta = \dfrac{-dx}{\sqrt{1 - x^2}}$                      (XI-58)
        $d\sqrt{1 - x^2} = \dfrac{-x\,dx}{\sqrt{1 - x^2}}$          (XI-59)

    Arc tan x (Single Precision non-input), 1 WT

        $d\theta = \dfrac{dx}{1 + x^2}$                              (XI-60)
        $d(1 + x^2) = 2x\,dx$                                        (XI-61)
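The incremental generation of these functions can be sketched in Python. This is an illustration under stated assumptions, not the QDPU mechanization: floating point replaces the quantized multi-increment registers, the function names are hypothetical, and a simple first-order update is used in place of the report's integration algorithm. The sinusoid pair follows the form of (XI-54) and (XI-55) with the usual minus sign on the cosine differential, and the arc sine follows (XI-56) and (XI-57).

```python
import math


def increment_sinusoid(theta_end, dtheta):
    """Generate sin and cos from 0 to theta_end by cross-coupled increments."""
    s, c = 0.0, 1.0
    for _ in range(round(theta_end / dtheta)):
        # Both updates use the values from the start of the step.
        s, c = s + c * dtheta, c - s * dtheta
    return s, c


def increment_arcsin(x_end, dx):
    """Generate arc sin x from 0 to x_end; root tracks sqrt(1 - x^2)."""
    x, theta, root = 0.0, 0.0, 1.0
    for _ in range(round(x_end / dx)):
        theta += dx / root        # form of (XI-56)
        root += -x * dx / root    # form of (XI-57)
        x += dx
    return theta


s, c = increment_sinusoid(1.0, 1e-4)
theta = increment_arcsin(0.5, 1e-4)
```

The common divisor in the two arc sine updates is the feature the text relies on: one reciprocal serves both increments.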

Without pressing the state of the art, at a word rate of 20,000
words/sec, the QDD2A can update any of these quantities in
0.05 milliseconds, which implies a speed advantage of a factor
of eight over the advanced GP. For operations involving
products the speed advantage can be deduced as follows.

The example GP has an operation time (for multiplication)
of about 60 μsec. The QDD2A on the average performs computations
involving products with the capability of 4 DDA
integrators per word time, executing 2 product incrementations
in parallel. Thus for a 0.05 millisecond word time the
average product time is 25 μsec, implying a 2.4 factor of rate
advantage for the QDD2A for products. A conventional DDA
can add variables as integrands without cost in time, but not
independent variables, which require quantization. The QDD2A
does both without cost in computation time. The comparison
GP requires an operation time for add or subtract, hence it is
just as slow as for multiplication. Assuming half the
operations in re-entry are add or subtract, the net speed
advantage for add, subtract and multiplication is a factor of
4.8. Including the trigonometric performance, the net speed
advantage in computations on continuous variables is a
factor of 5 to 8, or say 6.5. This estimate holds only for
variables which are well defined throughout the desired
phase of operation. The problem arising from singularity
and other discontinuity of variables is a most prominent
DDA programming and design problem. The approach of
allocating the bulk of real time computations to an incremental
computer offers the possibility of a mechanization
with a simplified GP or special computing unit of reduced
complexity, given the primary task of conditional whole
word communication and discontinuity handling functions.
Since in critical cases a singularity is the result of a coordinate
system, the occurrence of singularities is usually
at easily located points. Usually only one source of singularity
occurs within a given comparatively long time interval.*
Therefore, a simplified GP or special computing unit
of modest rate capability could devote essentially full
time to handling a single singularity effect in a program
branch. Certain DDA computations with discontinuities
can be handled using decision modes, specifically including
those in which cognisance of the valuelessness of a
singularity variable at the time of singularity is taken into
account by ignoring computed values; permanent error is
avoided by special programming. A clearcut example of
this capability is given by the QDD2A doppler damping
program presented in section XII. Decision modes are
essentially free in processing time in the QDD2A since full
arithmetic capability (none being required in decision
modes) is utilized in parallel with decision operation.

*Doppler damping program analysis shows how the QDD2A can handle an
important problem in which this condition is not required.
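The speed-advantage figures quoted earlier in this subsection reduce to a few ratios, collected here in a Python sketch. The variable names are illustrative assumptions. The sketch takes the GP times as 0.4 millisecond per trigonometric function and 60 microseconds per multiply, add or subtract, and the incremental machine's word time as 0.05 millisecond with two product incrementations per word time and with add and subtract costing no time.

```python
GP_TRIG_TIME = 0.4e-3    # sec per trigonometric function in the GP
GP_OP_TIME = 60e-6       # sec per multiply, add or subtract in the GP
WORD_TIME = 0.05e-3      # sec per word time at 20,000 words/sec
PRODUCTS_PER_WORD = 2    # product incrementations executed in parallel

trig_advantage = GP_TRIG_TIME / WORD_TIME

product_time = WORD_TIME / PRODUCTS_PER_WORD
product_advantage = GP_OP_TIME / product_time

# If half of all operations are add or subtract and cost nothing in the
# incremental machine, only the product half consumes time there.
product_fraction = 0.5
arithmetic_advantage = GP_OP_TIME / (product_fraction * product_time)
```

The three ratios evaluate to about 8, 2.4 and 4.8, the values stated in the text.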

While the scope of this study program has not permitted full
evaluation of all specific singularity problems which arise
in aerospace applications, it is believed that by careful
programming, utilizing the principles exemplified in the
doppler damping QDDA program, it may be possible to
obviate use of the GP for computations involving isolated
singularities. The extent of additional programming to
accomplish this result, if appreciable, would appear to be
<20 percent at most. A minor mechanization elaboration
for step changes, as occur in longitude, can be implemented
by decision command for sign reversal of a y register
variable, e.g. from $+\pi$ to $-\pi$ longitude. The major speed
advantage of the QDD2A over the GP is in this case retained
in the proposed system of reduced hardware complexity.

QDD2A Program For Total Pitch Acceleration Command In
Re-entry - The structural limitations and maximum allowable
heat input to the vehicle are critically determined by the total
pitch acceleration commanded by the digital computer during
re-entry. A unified control approach developed to guide the
missile to target point within these constraints involves a
total pitch acceleration command computation that uses
decision selection between two closely related command
alternatives. These command alternative computations may
be expressed in the form


where 0 < x < X = const (XI-62)


where the selection of sign is according to the test

    F(t) < x

It will be shown that the QDPU is highly efficient in executing
the basic control function as a result of its capability of parallel
divisions with a common divisor in the single QDPU.


Introducing  the  notation^ x/1  ■  r  obtain. 


a  .  v  .  .  J  (*  r. :  «1— ,  « 

°St  *  r  -  x  (*  r  -  x)(a  r  -  x  x  (x  -  1  A) 
The  differential  of  0^  may  be  shown  to  have  the  form 


(XI- 6  3) 


-  6  dx  -  du 

a 

x  -  iA 


(XI- 64) 


where  dU  ■  dy  4  d(W) 
dW  -  d  (yR) 

dR  -  1 


The calculation form for $d\theta_+$ and $d\theta_-$ is seen to be obtained in
the common divisor form. Further, because of the natural
feedback of output, the only programmed inputs are x and u for
each calculation, which are the same. A single QDPU can
therefore generate $\theta_+$ and $\theta_-$ from x and u using the four
transfer capability in single precision mode (3 bit multi-increment
in the proposed QDDA). The computation of W and
R is given by


    $dR = -\dfrac{R\,dx}{2x}$      (XI-65)

    $dW = y\,dR + R\,dy$          (XI-66)

Both of these calculations are executed together by one QDPU.
Selection of $\theta_+$ or $\theta_-$ is provided by the ordinary decision
command which, in the QDPU, is executed in parallel with a double
precision calculation because it is an essentially free operation.
The incrementation of $\theta_+$ and $\theta_-$ given x, u in one QDPU
requires one word time, whereas 6 to 8 word times are required
by a conventional serial DDA. Even if executed in one word
time in a parallel conventional single increment DDA, the result
would be orders of magnitude less accurate because of the
relatively high rates of change of x. The generation of R, W
given x, y by one QDPU in one word time has a 5 or 6 DDA
integrator equivalent. Figure 11-11 shows a schematic of the
QDPU program.
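The paired generation of R and W can be sketched in Python. The sketch rests on an assumption that should be read with care: R is taken to be the reciprocal square root of x, which is consistent with the minus sign and the 2x divisor in (XI-65) but is not stated legibly in the source. W is taken as the product yR, as (XI-66) implies. The function name and the sample input y = 3x are hypothetical, and floating point replaces the multi-increment registers.

```python
import math


def increment_r_w(x0, x_end, dx, y_of_x):
    """Advance R = x**-0.5 (assumed) and W = y*R by first-order increments."""
    x = x0
    r = 1.0 / math.sqrt(x0)
    y = y_of_x(x0)
    w = y * r
    for _ in range(round((x_end - x0) / dx)):
        dy = y_of_x(x + dx) - y
        dr = -r * dx / (2.0 * x)   # form of (XI-65) under the assumption on R
        w += y * dr + r * dy       # form of (XI-66)
        r += dr
        y += dy
        x += dx
    return x, r, w


x, r, w = increment_r_w(1.0, 2.0, 1e-4, lambda x: 3.0 * x)
```

Both increments share the inputs x and y, which is the property that lets one unit carry out the two updates together.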


Figure 11-11. Schematic of the QDPU Program for Pitch Command During
Re-entry


F. Challenging Airborne Digital Computer Routines and Possible
Aerospace Military Function Computations

1. Introduction - A number of airborne computation tasks
are not successfully handled by the conventional DDA,
including:

a. Doppler damping in airborne inertial navigation.
This computation, which presents problems that
may prove important in aerospace applications, is
analysed in detail in the next section and programmed
for the QDD2A as a subroutine in a full
aerospace mission computer. A computer
specially designed for airborne navigation alone
could be designed to meet state of the art accuracy
requirements by elaborating the conventional DDA
with the developed digital Stieltjes integration
algorithm and 2 bit increment computation.

b. Toss Bombing and Fire Control - These computations
require high rate variable computations with
modest to intermediate accuracy requirements over
short periods. Multi-increment computers
surpass the conventional DDA in rate handling
capability by variable (single) increment
computation; however, where more than
modest accuracy over the short term is required, the
multi-increment QDD2A is clearly called for. The
toss bombing program presented in Chapter VI is
directly applicable to the multi-increment QDD2A
(by communicating second rather than first
differences to the modified QDPU). The programming
of double precision mode in input
processing associated with earlier stages of toss
bombing computations would offer an additional
level of accuracy improvement. Fire control
computations involve computation requirements
similar to toss bombing but, in certain
cases, could be much more demanding from the
standpoint of accuracy and therefore could require the
QDD2A.

c. Digital Autopilot and Replacement of Stable
Platform Analogue Computers - Mechanisation
savings are possible by use of a digital computer
capable of handling such functions as autopilot
and stable platform leveling.

The possible aerospace military computation functions
are surmised to be similar in computation structure and
computation requirement types to the challenging
airborne digital computation tasks enumerated above.
The requirements are expected to be more
demanding.

Most of the computation problems of the class of
computation routines discussed in this section are
similar to those in the doppler damping problem
analysed in the next section.


G.  Doppler  Damping  Computations  For  Conventional  Navigation 
(Multi-Increment  QDDA) 

1. Introduction - The conventional DDA is at best marginal*
for that essential function in long term navigation
(> 2 hr): doppler damping. Reasons for low capability
in doppler damping in a DDA are: (1) Damping is executed
using doppler velocities measured in rapidly changing
craft coordinates, which must be transformed to inertial
coordinates. (2) Operation is subject to periods of
valueless information (for damping), which can result in
large errors if damping is used when craft orientation
brings the resultant doppler beams to vertical orientation**.
Several conventional computation approaches
using DDA-GP combinations have been found lacking.

They involve either:

(1) The approximation that craft orientation is on the
average approximately horizontal and can be assumed
so continuously (leading to poor results in flight test),
or

(2) The supervision of the DDA by the general purpose
computer in error sensitive portions of the routine.
As the GP generally has a large program, it turns
out that the iteration rate of the GP may be so low
that the GP supervised variables change excessively
between iterations, making the supervision inadequate.
The approach of GP supervising several times per GP
iteration reduces this inadequacy, but at a corresponding
price in programming length in an already
cramped system, or by further reducing GP iteration
rate. In the chapter on computation capability the
conventional DDA is shown to be lacking even when
damping accuracy is assumed to be substantially sacrificed.

*Marginal in low accuracy navigation, inadequate in high accuracy navigation.

**Despite the fact that a conventional DDA has equivalent decision modes.

The QDD2A will be shown to be ideally suited to doppler damping
in conventional navigation without GP supervision by
virtue of the QDD2A features of:

(1) Multi-increment computation

(2) Relatively high iteration rate

(3) Decision modes enabling damping turn off and
pseudo variable computation (described below)

(4) Double precision mode for error sensitive
computations, e.g. sinusoid.

The quantitative performance of the QDDA is given along
with that of the conventional DDA in the computation
capability analysis referred to. The equations for doppler
damping, which later will be put in form for QDDA computation,
involve use of altimeter information to obtain vertical
velocity information.

2. Raw Equations for Doppler-Altimeter Deduced Inertial
Velocity - Two doppler beams in rigid craft coordinates

    $\bar{b}_{s,p} = A\bar{x} \pm C\bar{y} + B\bar{z}$    (XI-67)

where $A = -\sin\chi$, $B = \sin\chi\cos\nu$, $C = \cos\chi\cos\nu$
yield, from the doppler radar transducer, the starboard
and port components


(P  +  P  ) 

4  P 


(XI- 68) 


from pulse rates $P_{s,p}$ which are related to velocity
components


AV-  +  CV-  +  BV- 
x  y  z 


(XI-69) 


r 

4  -  aV-  -  CV-  +  BV- 
At  2  x  y  a 

hence 

(XI- 70) 

    $S = A V_{\bar{x}} + B V_{\bar{z}}$,  $P = C V_{\bar{y}}$    (XI-71)

Craft to earth velocity vector transformation is

    $V_x = V_{\bar{x}}\cos\theta + V_{\bar{y}}\sin\phi\sin\theta + V_{\bar{z}}\sin\theta\cos\phi$    (XI-72)

    $V_y = 0 + V_{\bar{y}}\cos\phi - V_{\bar{z}}\sin\phi$    (XI-73)

    $V_z = -V_{\bar{x}}\sin\theta + V_{\bar{y}}\cos\theta\sin\phi + V_{\bar{z}}\cos\phi\cos\theta$    (XI-74)

where $V_z = dh/dt$, h barometric altitude. Doppler-altimeter
deduced velocities in computer inertial coordinate orientation:

    $V_{XD} = V_x\cos\chi - V_y\sin\chi$    (XI-75)

    $V_{YD} = V_y\cos\chi + V_x\sin\chi$    (XI-76)
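A Python sketch of the craft-to-earth transformation (XI-72) to (XI-74) and the heading rotation (XI-75), (XI-76) follows. It is an illustration under assumptions: the first angle is taken as pitch and the second as roll, the coefficient pattern is read as the orthogonal product of a pitch and a roll rotation, and the function and parameter names are hypothetical. Because the matrix is orthogonal, the speed is unchanged by the transformation, which gives a simple check.

```python
import math


def craft_to_earth(v_craft, theta, phi):
    """Rotate craft-axis velocity components by pitch theta and roll phi."""
    vx, vy, vz = v_craft
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    return (
        vx * ct + vy * sp * st + vz * st * cp,   # form of (XI-72)
        vy * cp - vz * sp,                       # form of (XI-73)
        -vx * st + vy * ct * sp + vz * cp * ct,  # form of (XI-74)
    )


def rotate_heading(v_x, v_y, chi):
    """Rotate horizontal components into the computer axes, (XI-75), (XI-76)."""
    c, s = math.cos(chi), math.sin(chi)
    return (v_x * c - v_y * s, v_y * c + v_x * s)


v_earth = craft_to_earth((1.0, 2.0, 3.0), 0.3, -0.2)
speed_squared = sum(component * component for component in v_earth)
```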



3. Computation Problems and Special Computation Methods -
An important feature of the doppler damping equations is
that, for certain craft orientations (primarily pitch angle
up so that the radar beam is vertical), the equations do not
yield a solution for horizontal velocity. The successful
use of a DDA in performing these computations clearly
requires that the DDA be capable of decision functions
as well as high rate handling capability. The specific
decision functions required are:

(1) Damping turnoff when craft orientation renders
doppler-altimeter deduced velocity unreliable.

(2) Pseudo variable computation in place of a variable
with unacceptable analytical properties for DDA
computation: i.e., the replacing of a variable with a
decision modified variable which is equal to the desired
variable when the variable is used (during damping)
and, when the variable is not used, is a well behaved
variable for accurate DDA computation. Clearly,
the pseudo variable computation approach is a fundamental
one in broadening the scope of DDA application.

The stated implicit form of the damping equations could in
principle be solved only by a DDA with servo mode. The
generally low accuracy performance of servo computation
is further degraded to unfeasibility for this application
because of the peculiar analytical character of the
calculations for certain craft orientations. The orientation
sensitivity is revealed by the explicit form of the
equations stated:




(XI-77) 
(XI- 78) 

(XI- 79) 


where 

U  *  ein  0  +  g-  cos  9  cos  0 

X  ■  cos  0  -  g  sin  6  cos  9 

v  ■  g  sin  0 

-A  . 

S  ■  n  cos  0 

f.  •  sin  0  cos  0  +  g  cos  6 

Note  that  u  ■  0  for  cos  0el  when  0  »  0  is  such  that  tan  0 

C  C  13 

a situation which can occur with relatively high frequency during
craft maneuver or bad weather. When the critical pitch angle $\theta_c$ is
approximately taken by the craft, two error effects degrade
computation accuracy:

(1) Scaling of a reciprocal calculation, if used, is such that
granularity is unacceptable. Looseness of a servo loop,
if used in implicit calculation, degrades accuracy.

(2) The factors $1/\mu$ of variables of each of three independent
information sources (two doppler velocities and altimeter),
each subject to independent errors, cause unacceptable
noise amplification for




In general, for a computer, a decision mode must cut off damping
in consequence of error effect (2). In order to prevent rate
limiting or servo loop instability, which can permanently
nullify the computation accuracy after a transition through
$\mu = 0$, it is necessary either to reset DDA computed variables
(expensive and usually inadequate) or to compute in pseudo
variables (defined above). The QDD2A may always compute
in terms of amenable variables provided $1/\mu$ is available
when damping is on. The other variables require high rate
handling capability, which the QDD2A is designed to possess.
Generation of $1/\mu$ when damping is on does not require that
$1/\mu$ be generated when damping is off. Because the craft
orientation pattern is not known a priori, $\mu$ must certainly be
available with good accuracy at all times. These facts,
together with the requirement of overcoming error effect (1),
suggest the computation of the variable $1/\mu^*$, where

    $1/\mu^* = 1/\mu$       for $\mu \ge \mu_0$
    $1/\mu^* = 1/\mu_0$     for $\mu < \mu_0$

choosing damping turn off when $\mu < \mu_0$, where $\mu_0$ is an
appropriately chosen constant. It is seen that $1/\mu^* = 1/\mu$
when damping is on, hence no approximation is made during
periods of useful damping information. Appropriate $\mu_0$
selection also ensures that no scaling problem is presented
in reciprocal computation. This type of computation, used
in the $1/\mu^*$ computation, might be called function limiting. The
QDD2A has a function limiting mode which is highly efficient
in equivalent integrator performance because several other
operations may be performed in parallel. The hardware
cost of the mode is minor because the other parallel operations
are achieved in the normal QDD2A program.
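Function limiting of the reciprocal can be sketched in Python as follows. The function name is hypothetical, and the sketch assumes that below the threshold the reciprocal is held at the reciprocal of the threshold constant, which is one reading of the garbled two-case definition in the source. The damping switch is returned alongside the limited value, since the text ties damping turn off to the same comparison.

```python
def limited_reciprocal(mu, mu0):
    """Return (1/mu*, damping_on) with the reciprocal held at 1/mu0 below mu0.

    Holding the value at 1/mu0 is an assumption of this sketch.
    """
    damping_on = mu >= mu0
    value = 1.0 / mu if damping_on else 1.0 / mu0
    return value, damping_on
```

For a threshold of 0.1, `limited_reciprocal(0.5, 0.1)` gives the exact reciprocal with damping on, and `limited_reciprocal(0.05, 0.1)` gives the held value with damping off, so the reciprocal never exceeds the scaling bound.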


4. QDD2A Calculations Involving Decision Modes for the
Doppler Damping Case - The QDDA computations of the
primary feedbacks $dD_{x,y}$ for doppler-altimeter damping
should include the following:

    $dD_x = K_1\,dC_x^*$    (XI-80)

    $dD_y = K_1\,dC_y^*$    (XI-81)

where

    $dC_{x,y}^* = dC_{x,y}$ if $\mu \ge \mu_0$,    $dC_{x,y}^* = 0$ if $\mu < \mu_0$

    $d\mu^* = d\mu$ if $\mu \ge \mu_0$,    $d\mu^* = 0$ if $\mu < \mu_0$

    $dC_x = \dfrac{(\mu V_{XD}\,dt) - \mu\,(V_{XI}\,dt)}{\mu^*}$    (XI-82)

    $dC_y = \dfrac{(\mu V_{YD}\,dt) - \mu\,(V_{YI}\,dt)}{\mu^*}$    (XI-83)

the quantities $V_{XI}\,dt$, $V_{YI}\,dt$ being the inertial velocity
increments computed in the inertial navigation routine,
and $\mu V_{XD}\,dt$, $\mu V_{YD}\,dt$ being the doppler-altimeter deduced
quantities of the damping routine, formed by quantising
component terms separately computed. Were computation
precise, then when $\mu^* = \mu$, $dC_{x,y} = V_{XD,YD}\,dt - V_{XI,YI}\,dt$.
The chosen computation procedure is preferred
rather than the computation

    $dC_{x,y} = \dfrac{(\mu V_{XD,YD}\,dt)}{\mu^*} - V_{XI,YI}\,dt$    (XI-84)

which requires less programming. This occurs because
any long term errors which develop in division, in the
chosen procedure, have the unimportant effect of changing
the damping constants slightly rather than destroying the
accuracy of doppler deduced error velocity as in the
simpler computation.
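The preference for the chosen procedure over the simpler one can be illustrated numerically in Python. The function names and the sample numbers are assumptions made for illustration. The two functions follow the forms of (XI-82), (XI-83) and of (XI-84), and the divisor is given a deliberate 1 percent error to mimic a long term division error.

```python
def chosen_increment(mu, mu_star, v_doppler, v_inertial, dt):
    """Difference formed first, then divided, as in (XI-82), (XI-83)."""
    return (mu * v_doppler * dt - mu * v_inertial * dt) / mu_star


def simpler_increment(mu, mu_star, v_doppler, v_inertial, dt):
    """Doppler term divided first, then differenced, as in (XI-84)."""
    return (mu * v_doppler * dt) / mu_star - v_inertial * dt


# Hypothetical case: no true velocity error, divisor 1 percent too large.
mu, mu_star = 0.5, 0.505
chosen = chosen_increment(mu, mu_star, 500.0, 500.0, 1.0)
simpler = simpler_increment(mu, mu_star, 500.0, 500.0, 1.0)
```

In this case the chosen form returns zero, since a divisor error only rescales the error velocity, while the simpler form reports a spurious error velocity of several ft/sec.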

The QDPU has the capability of utilizing one decision
command signal (per word time). A marked input to the
$\delta_m$ register identifies the decision command signal D
(where D = 1, 0) which then, according to the programmed
decision mode of the QDPU, modifies transfer action of
$\delta_m$ register contents after the unmarked inputs to $\delta_m$
have been used to update the $\delta_m$ register. This readily
mechanised design feature accords especial efficiency to
the QDPU in the decision modes, since otherwise normal
programmable inputs distributed to the three $\delta$ registers
present the maximal variables for operation, and otherwise
normal programmable transfer action retains full
operations versatility. The QDPU program schematics
for operations involving function limiting and signal cutoff
for the doppler damping application are presented in
Figures 11-12 and 11-13 on the following pages. In this
example of function limiting, 2 QDPU perform* the program
routine of 10 DDA integrators (duplicated operations
in the $K_1/\mu^*$ calculation reducing DDA requirements from 14
to 10). In the case of signal cutoff operation, exemplified
in the second QDPU schematic, the efficiency would
probably be reduced somewhat from the latter performance
in certain other examples than doppler inertial damping.
The capability (and requirement in this case) of whole word
drift correction called for external input, external output
quantities, leading to an efficiency of 1 QDPU for 5 DDA
integrators.

*In terms of hardware processing rate (actually the greater precision of
the QDD2A by multi-increment computation and algorithm amounts to an
additional speed advantage).


Figure 11-12. Doppler Damping Without GP Supervision (Multi-Increment
QDPU Using Decision Mode)


In the decision mode, the first input to the δ_m register picked up is
used as a decision variable to modify transfer of the δ_m contents formed
by updating with the remaining inputs to δ_m. The decision command
variable is generated in the parallel channel of input processing
calculations, since no multi-transfer operations are required for decision
nor available in fully efficient input processing.

The Ki dCy computation may be regarded as having only the performance
1 QDPU = 3 DDA integrators, since in a DDA the quantity d(Ki/a*) would be
available in the Ki dCx calculation. The subroutine is evaluated as
2 QDPU = 10 DDA integrators.
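
The decision-mode behavior described above can be illustrated with a minimal sketch. It assumes a cutoff-style decision in which D = 1 passes the updated δ_m contents and D = 0 suppresses the transfer. The function name, the list representation of the inputs, and the choice of suppression as the "modified" transfer are illustrative assumptions, not the QDPU mechanization.

```python
def decision_mode_transfer(inputs, delta_m=0):
    """Return (transferred value, updated delta_m) for one word time.

    inputs[0] is the marked input: the decision command D (1 or 0).
    inputs[1:] are the unmarked increments that update delta_m.
    Assumption: D = 0 cuts the transfer off (a signal-cutoff mode).
    """
    if not inputs:
        raise ValueError("a decision command input is required")
    d = inputs[0]
    if d not in (0, 1):
        raise ValueError("decision command must be 1 or 0")
    # Unmarked inputs update the register before the decision is applied.
    delta_m += sum(inputs[1:])
    # The decision command gates the transfer of the updated contents.
    transferred = delta_m if d == 1 else 0
    return transferred, delta_m
```

The register is updated in both cases; only the transfer is gated, which mirrors the ordering stated in the text (update first, then decision).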

5. Single Precision and Double Precision Calculation Allocation - One of
the most powerful features of the QDD²A design is that it enables,
according to the particular computation requirements of a calculation
routine, the allocation of appropriate computation capacity to meet those
requirements. Analogous to the general purpose computer, which is capable
of double precision programming, the proposed incremental computer may
use brute force where necessary and not otherwise. Because the QDD²A has
been designed on a more fundamental logical level than the conventional
GP, it can achieve double precision by arithmetic unit transformations
rather than by data flow programming. Thus, double precision is obtained
with only a reduction factor of 2 in instantaneous processing rate,
whereas in a GP the reduction is 4 to 7*. In the particular problem of
doppler damping, analysis indicates that double precision (input
processing) is not necessary throughout the calculation because
0.2 percent accuracy is quite adequate. Special subroutines, however,
which are error growth sensitive and which enter the computations in such
a manner that damping does not attenuate error buildup, require double
precision. The sinusoid calculations on input angles are the case in
point and have been allocated double precision.

6. Doppler Damping Program for QDD²A Computation - The doppler-altimeter
deduced inertial velocity calculations and a portion of the blended
calculations for feedback and inertial operations are summarized below in
differential form. QDPU allocation is indicated and evaluated relative to
the conventional DDA (in cases where the relative evaluation is deducible
only by sets of QDPU, corrections are interspersed to give the correct
total evaluation).


* The design approach developed in this study could be applied to the
design of a GP with adaptable precision and maximal speed performance.


d  <VxdJ«y->  ■  «  ■  CvV  *  «CD  D>ci<ion  ]d  J.  d. 
Inertial  Accel  Biae  Corrected  Cutoff 


d  Fib'  dt  =  SF' d  [u>  dt  +  buu  dt 

J  y  y  J  y  y 

External  Output 
(Gyro  Torque) 


Analogous  for  x-*y,  y-*x 

dKlcK .  k,  t(av>,“)  vstm 


Analogous  for  x  -»  y 

where  [«(VX-V^)dt  -  dWjt  ♦  d  ?x  ♦  Vj|  -  tooVx  dtl] 

*nd  Vx  is  inertial  velocity 


1QDPU  *  5DDA  (XI-85) 
Integrators 


(XI-86) 


1QDPU  ■  5DDA 
Integrators 

1QDPU  *  7DDA  (X1.87) 
Integrators 

1QDPU  ■  3DDA 
Integrators 

(XI- 88) 




dBcc 


coa  0d  cos  6  +  cos  6d  cos  0 

fiP 


d?„  * 


-  cos 


|>8+Pp)dt1 
[  4At  B/X  J 


(XI-89) 


1QDPU  *  5DDA 
Integrators 


(XI-90) 


dB 


CS 


cos  0d  cos  8  +  sinfld  cos  0 

(s)'1 


d?„  »  sin 


r  <p,+  pp)  dt  i 

|_  4At  B/\  J 


(XI- 91) 


1QDPU  *  5DDA 
Integrators 


(XI-92) 


d  (V  cos  x)  «  dV  cos  x  +  V  d  cos  x 

x  X  X  •) 

J  1QDPU  *  4DDA  Integrators  / vT  Qi> 
d  (V^  cos  y)  «  dVy  cos  x  +  Vyd  cos  x  "  * 

d  (V  sin  x)  «  dV  .  sin  x  +  V  .  d  sin  x 

*  X  X  1 

.  .  j  1QDPU  ■  4DDA  Integrators  (XI-94) 

d  (V  sin  x)  *  dV  •  sin  x  +  V  •  d  sin  x 

y  y  y 


dv  = 

x 


>«.»-»  ]dj  A,  /Vd.\ 
S  *b\  b  / 


.0 


A 

B 


.  <vv* 

44*BA  44*BA 


flWT  ■  3DDA  Integ  (XI-95) 




du  =  sin  t>  (dt?  +  d?  )  +  .in  <t>d  f  £  SV  (V  dt)  -  £  d* 

y  y  y  B  zb  zb  b  y 


d  r  £  SV  (V  dt)  =  ^  °v  •  (V  dt) 

B  ZB  ZB  B  ZB  *B 


A  S. 


1 WT  *  4DDA 
(XI- 96) 


d4 


dll 


■in  0  C(P,-Pp)dt] 


4At  c/X 


1  WT  *  3DDA  Integ 


cos  8  [(Pa-Pp)dt1 


4At  c/X 


(XI-97) 


(XI- 98) 


d  sin  0  *  cos  0  d8 


d  cos  0  *  -sin  0  d0 


1  WT  .  2DDA  Integ  ( ^1.  Pr.ci.ionj 


(XI-99) 
(XI-  100) 


d  sin  0  «  cot  8  d6 


d  cos  •  »  -sin8  d8 


1  WT  >  2DDA  Integ 


^Double 


Precision. 
Mode  ' 


(XI-101) 
(XI-  102) 
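
The incremental sinusoid relations used in this program, d sin θ = cos θ dθ and d cos θ = -sin θ dθ, can be exercised with a short sketch. It uses floating-point arithmetic and a simple first-order update, which are assumptions made for illustration; it does not model the QDPU registers, the multi-increment algorithm, or the double precision mode.

```python
import math

def incremental_sinusoid(theta_end, d_theta):
    """Step sin and cos from theta = 0 to theta_end with first-order increments."""
    s, c = 0.0, 1.0
    steps = round(theta_end / d_theta)
    for _ in range(steps):
        ds = c * d_theta        # d sin(theta) =  cos(theta) d(theta)
        dc = -s * d_theta       # d cos(theta) = -sin(theta) d(theta)
        s, c = s + ds, c + dc
    return s, c

s, c = incremental_sinusoid(1.0, 2.0 ** -10)
print(abs(s - math.sin(1.0)), abs(c - math.cos(1.0)))
```

With this simple update the amplitude s² + c² grows slowly with every step, which gives a feel for why sinusoid routines are singled out in the text as sensitive to error growth.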


Calculation uses


9 QDPU in Single Precision = 40 DDA Integrators
2 QDPU in Double Precision = 4 DDA Integrators


In terms of pure rate of routine processing possible, the result is
1 QDPU = 4 DDA integrators. Multi-increment computation and double
precision sinusoids raise effective speed by a supplementary factor in
excess of 8, i.e., to about 30 times the conventional DDA rate.
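
The arithmetic behind this summary can be checked directly. The sketch below only restates the figures quoted above; taking the supplementary factor as exactly 8 is an assumption standing in for "in excess of 8".

```python
single = (9, 40)   # (QDPU count, equivalent DDA integrators), single precision
double = (2, 4)    # same, double precision

qdpu_total = single[0] + double[0]
dda_total = single[1] + double[1]
dda_per_qdpu = dda_total / qdpu_total        # 44 / 11

supplementary_factor = 8                     # lower bound quoted in the text
effective_speedup = dda_per_qdpu * supplementary_factor
print(dda_per_qdpu, effective_speedup)
```

The ratio of 4 DDA integrators per QDPU multiplied by the factor of 8 gives 32, consistent with the quoted figure of about 30.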


* The equivalence in DDA integrators is one of word times per operation,
ignoring QDPU superiority in algorithm, multi-increment computation, and
double precision.


CHAPTER  XII 


PROGRAMMABLE  TRANSFER  OPERATIONS  OF  THE  QDPU 
AND  QDPU  PROGRAMMING  CODE  STUDIES 

12.0 INTRODUCTION - The level of versatility of the QDPU, reflected
in the aerospace modal computation program studies of Chapter XI, requires
a certain minimal set of programmable transfer operations. A mechanization
can realize the minimal set in many equivalent ways as a result of
the equivalence of members of sets of like registers and of transfer units.

The problem of developing a programming code which implies efficient
mechanization involves attaining the minimal programmable set with a
code of acceptable storage and decode simplicity. Mode action of the QDPU
is effected by programming transfer operations and output criteria
according to a program code, which is stored in the auxiliary memory
associated with each QDPU. A program code will be delineated which enables
programming all the transfer actions (each involving the 4 multi-transfer
and 5 single transfer units). Having formed a concept of the minimal
required set of programmable sets of transfer actions, the problem of
determining a program code which may subsequently enable economical
mechanization is investigated. The optimal code depends in considerable
degree on whether core or drum memory is used. Generally, however, the
factors which measure code optimality are the code length and decode
complexity. The minimal code length is about 9 bits; in so short a form,
however, it presents a more complex decode logic. On the other hand, the
same program expressed in a code of length 16 to 20 bits can have
relatively simple decode logic but requires more flip-flops. In the
following analysis, the primary objective is to define the minimal set of
desired programmable transfer operations of the QDPU, from which a code
for minimum hardware costs may be derived.


The programmable set will be expressed in a code of intermediate length
which may serve as a base for developing an equivalent code for the QDPU,
efficiently adapted to the mechanization type. Each multi-transfer unit,
MT, and single transfer unit, ST, must be assigned in each programmed
QDPU mode action to have:

A. a register, the contents of which are transferred by the unit
(a δ or y register).

B. a register, the contents of which determine degree and kind of
transfer, i.e., the (independent variable) δ register.

C. a register or registers to which the transfer unit results are
added (if added), i.e., an R register or (for internal transfer,
if it were mechanized) a y register.

The statement of the programmable set is facilitated by the register
labeling indicated in Figure 12-1. The identical function of registers and
transfer units of the same type (that is, δ registers, y registers,
R registers, multi-transfer units, or single transfer units), together
with the assumption of complete programmability of inputs, implies that
the same computation can in principle be effected in a large number of
equivalent programs of transfer actions. This indicates the probable
existence of a simplified program code for a subset of transfer actions
which calls for all desired modes of computation of a perfectly general
code.

Before developing a simplified QDPU program code, consider the
implications of complete programmability of transfer operations. Each of
4 multi-transfer units would have independent variable selections
corresponding to any one of 5 δ-register contents or full rate or zero
rate, and transfer variable selections for 3 y registers. The results can
go to R1 or R2 or R1 and R2, which implies 252 alternatives. Each of
5 single transfers would have independent variable selections
corresponding to any of 5 δ-registers or full rate or zero rate. The
results can go to any one of 3 y-registers or to R1 or R2, which implies
175 alternatives.
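
The two totals can be reproduced by counting the selections for one unit and multiplying by the number of units. Reading the totals as per-unit counts summed over units is an inference from the numbers, not something the text states; the variable names below are illustrative.

```python
delta_registers, rate_options = 5, 2          # 5 delta registers, full rate, zero rate
independent_choices = delta_registers + rate_options   # 7 selections

# Multi-transfer: one of 3 y registers is transferred; results go to
# R1, R2, or both R1 and R2.
mt_units, y_registers, mt_destinations = 4, 3, 3
mt_alternatives = mt_units * independent_choices * y_registers * mt_destinations

# Single transfer: results go to one of 3 y registers, or to R1, or to R2.
st_units, st_destinations = 5, 3 + 2
st_alternatives = st_units * independent_choices * st_destinations

print(mt_alternatives, st_alternatives)
```

Under this reading each multi-transfer unit has 63 alternatives and each single transfer unit has 35.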


Figure  12-1.  QDPU  Register  Labeling. 


12. 1  MULTI- TRANSFER  AND  QUOTIENT  ALGORITHM  MODE  CODE  - 

The  four  multi -transfer  units  are  assigned  y  values  which  need  not  be  pro¬ 
grammable.  The  outputs  of  the  multi-transfer  units  go  either  to  Ri  or  Rs 
registers.  No  loss  of  computation  versatility  results  by  requiring  transfers 
ot  yx  and  ym  to  Rt  without  programmability.  The  remaining  trans¬ 
fers  of  y»  and  y  should  go  to  Rs  or  Rt  programmably.  The  case  of  y 
is  of  use  only  in  double  precieion  mode;  hence,  it  is  called  for  in  that 
mode  only. 


Thus, the multi-transfer operations T update the R registers according to
the equations,


M  ■  MTi  (yi  ,  Axj,  Rin_t)  (X1I-1) 

Rj**  *  MT,  (ym,  Ax,,  R?n  )  (XII-2) 

(Ri  or  R|  )  ■  MTa  (ya»  Ax,,  (Ri  *  or  R*^  )  according  aa  M  *  1,  0  (XII- 3) 

0  000 

(Ra  or  R,)  ■  MT,  (ym,  Ax «,  (Ri  or  R,nl)  )  according  aa  M  •  1,  0  (XII-4) 


where  m,  M  are  multi -tranefer  unit  programming  bite,  and  Ax^  are  programmed 

independent  variablee  which  for  a  Ax  may  be  A, ,  6  ,  6a  or  x  where 

n  m 

{0  for  Ax^  and  Ax, 

6^  for  Ax,  (if  eelected  induce  a  quotient  algorithm) 

6f  for  Ax«  (if  eelected  inducea  quotient  algorithm) 


All  Ax  eelectione  are  defined  by  8  programming  bite 

dj  Aj  dj  Aj  dj  d« 

Complete  vereatility  of  multi -tranefer  modea  ia  obtained  by  the  eet  of  program¬ 
ming  bit  a 

mM  A}  d]  Ae  d,  Aa  de  da  da 

Theee  bita  alao  determine  certain  actiona  not  indicated  by  the  R  update  equationa; 

A.  When  m  ■  1,  M*  t  double  preciaion  mode  in  which  all  tranafera  to 
Ri  occur. 




B.  When  m  »  1,  M  ■  1,  yx  inatead  of  ys  is  used  in  T3; thereby*  ob¬ 
taining  double  precision  of  Rx  transfers  while  not  using  the  ya 
register  for  it. 

C.  When  m*l,  M  *  1  the  ya  register  is  used  for  decision  command 
output  of  R,  channel  according  to  sign  of  ya. 

D.  When  m  ■  0,  M  *  1  double  precision  with  MTX ,  MTa,  to  Rx  and 
MT3 ,  MT4  to  Ra  is  called  for  (there  being  no  decision  command 
mode  in  this  case). 

E.  When  Axa  *  5a,  quotient  algorithm  in  Rx  and  when  Ax*  ■  6b 
quotient  algorithm  in  Ra. 

12.2 SINGLE TRANSFER, DECISION RESPONSE, AND INTEGRATION
ALGORITHM MODE CODE - The five single transfer units are used to update
y registers and R registers programmably. The update of y registers
depends on whether constants are desired in certain y registers or
decision mode action is sought.

The  update  of  R  registers  with  single  transfer  is  automatic  for  &a,  to  Rx  R, 
respectively  if  non-quotient  algorithm  (Ax,  4  fla,  Ax*  ■  6b)  and  if  Ax,  ■  fta  is 
automatic  for  Ax  to  Rx . 

The following equations indicate single transfer unit action according to
response and single transfer programming bits D, I1, I2, I3 (it being
understood that the operations indicated, other than single transfer and
logical multiplication with the decision command variable D_m, serve
merely the purpose of defining applicable equations by substitution of
1 or 0 for program bits):

Tt  ARX  +  Ix  Ayx  ■  STX  (Sx )  (XII-5) 

*m'Dm®ST»<I*6»+6m1’>  <XII-o> 




(XII-7) 


DAyx  +  DAya  ■  Za  •  STs  (D6  +  0  6,) 

m 

ARj  ■  ST*  (ia)  If  Ax,  ■  6a 
AR,  «  STb  (6b)  if  Ax*  >  6b 


(XII-8) 
(XII  >9) 


The integration algorithm for y1 differs from that for the remaining
y registers if an algorithm bit A is 1. The first half of the series of
QDPU has the lagged algorithm unless A = 1, in which case y1 has the
unlagged algorithm. For the second half the reverse interpretation holds.
A summary of single transfer programming bits is

D I1 I2 I3 A

not including Δ3 d3 = 00, Δ4 d4 = 00 for STa and STb.

12.3 SUMMARY OF QDPU PROGRAM MODE CODE - The total code is

m M Δ1 d1 Δ2 d2 Δ3 d3 Δ4 d4 D I1 I2 I3 A

totaling 15 bits.
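
The bit count can be illustrated by packing the code into a single word. This is a sketch only: the field names `sel1` to `sel4` (standing for the four two-bit independent-variable selections), the most-significant-first ordering, and the integer word format are hypothetical choices, not the report's storage layout.

```python
# (field name, width in bits), in the order the code is listed in the text.
FIELDS = [("m", 1), ("M", 1), ("sel1", 2), ("sel2", 2), ("sel3", 2),
          ("sel4", 2), ("D", 1), ("I1", 1), ("I2", 1), ("I3", 1), ("A", 1)]

CODE_LENGTH = sum(width for _, width in FIELDS)

def pack(values):
    """Pack a dict of field values into one integer, first field most significant."""
    word = 0
    for name, width in FIELDS:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bit(s)")
        word = (word << width) | value
    return word

def unpack(word):
    """Recover the field values from a packed word."""
    values = {}
    for name, width in reversed(FIELDS):
        values[name] = word & ((1 << width) - 1)
        word >>= width
    return values
```

Two mode bits, eight selection bits, and five single transfer bits give the 15-bit total; a shorter code of about 9 bits, as discussed in 12.0, would need a more elaborate decode than this direct field layout.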




CHAPTER  XIII 


LOGICAL  DESIGN  INVESTIGATIONS  OF  SECOND  DIFFERENCE  INCREMENTAL 
COMPUTERS  WITH  CONVENTIONAL  AND  GENERAL  (QUOTIENT)  ALGORITHM 

13.0 INTRODUCTION - The new concepts of multi-increment computer
design with second difference computation, communication, and δ registers,
developed in Chapter VII and verified in simulations described in
Chapter IX, clearly present a wealth of new factors in logical design.
The relatively high resolution of first differences, implied by
computation of second difference single increments for the bulk of
variables typically involved in internal computation rather than direct
input processing, provides a basis for attaining new levels of precision
where communication structure may have the same simplicity as the
conventional DDA. The basic digital processing unit (generalized DDA
integrator or DPU) is capable of multi-increment computation with general
(quotient) algorithm for the first time and with remarkable digital
processing simplicity. The question arose as to what further digital
processings, based on second difference computation, appeared natural in
the logical design structure of the basic digital processing unit,
granting some license initially in consistency with system function.
Results could then be adapted as basic design techniques for portions of
a full scale incremental computer system, with potentially significant
rewards in overall mechanization simplicity.

The first logical design effort was concentrated on the development of a
multi-transfer unit especially adapted to second difference inputs. If the
second difference is single increment, a derived multi-transfer unit,
which may be called the D² multiplier, offers a mechanization simplicity.
This compares to the simplicity of a conventional 3 bit multiplier,
capable of an M bit multi-transfer, where only the scaling of the second
difference input for single increment limits the value of M. Such a unit
in the DPU is capable, for example, of serial computation of a sinusoid
with time as the independent variable with 20 bit resolution where steps
are 10 bit increment. Conventional design methods would require a
multiplier of several times the mechanization complexity. In a simulation
computer, in generation of analytic functions, the D² multiplier provides
remarkable economy. In aerospace applications the D² multiplier
(evaluated in Chapter VII, paragraph 7.3) presents limitations in handling
external inputs. Hybridization with the conventional multiplier, with
appropriate restriction of application, should be a significant advance
in multi-increment computer design technique. One generalization of the
D² multiplier, which may be called the PD² multiplier, is utilized in the
PDD²A computer (derived in paragraph 13.2). With only slight increase in
complexity over the D² multiplier, the PD² multiplier makes possible the
incrementation of a product usually requiring two distinct
multi-transfers, under the same conditions on computation variables as in
the case of the D² multiplier.

A  conventional  two  transfer  mechanization  for  product  with  the  same  speed 
as  the  PD*  multiplier  hai  flip-flop  requirements  a  factor  -p  =  -“-times 
greater complexity. A DPU with two D² multipliers, or two conventional
multipliers, is capable of executing the general (quotient) algorithm. The
D² multiplier offers here the first discussed mechanization saving,
provided scaling of single increment second difference inputs is
acceptable. The second kind of basic logical design development in second
difference computation is that of natural second difference overflow for
non-division algorithm, in which one register is deleted from the
originally proposed mechanization. The mechanization for natural overflow
offers a saving relative to the mechanization for general (quotient)
algorithm in non-quotient operation. Only the general (quotient)
algorithm has been simulated and verified in simulation, specifically for
quotient operation, an operation which implies, because of generality, a
consistent operation in the less demanding non-quotient operation (where
the divisor is unity or a whole word constant). The new overflow
mechanization should be evaluated by simulation to determine the level of
roundoff error implied by an apparently reduced residue retention.

The complete logical design of the DD²A, incorporating the concepts of
second difference computation, communication, and δ registers for input
accumulation, together with the developments of the D² multiplier and
natural second difference overflow, is presented to demonstrate the
concrete mechanization of these concepts. Finally, the configuration of a
QDD²A capable of full aerospace mission computations, including required
input processing and internal computations at new levels of accuracy, is
presented. Estimates of the flip-flop and diode requirements of the
computers are presented in Table 13-6.

13.1 THE DD²A

A. Structure of DD²A Integrator. - The structure of the DD²A
integrator is suggested by the form of the second difference
of the numerical approximation of $\int y\,dx$. Assuming trapezoidal
integration,

$Z_{n+1} = Z(x_{n+1}) = \sum_{i=0}^{n} \left(y_i + \tfrac{1}{2}\Delta y_i\right)\Delta x_i$,

where $y_i$ represents the value of y at the beginning of the i-th step
(so that $y_0$ is the initial value of y) and $\Delta u_n = u_{n+1} - u_n$.
At the n-th step, the second difference of $Z_n$ is given in
equation (13-1).

$\Delta^2 Z_n = y_{n+1}\,\Delta^2 x_n + \Delta x_n\,\Delta y_n + \tfrac{1}{2}\left(\Delta x_{n+1}\,\Delta y_{n+1} - \Delta x_n\,\Delta y_n\right)$   (13-1)

The term within parentheses is the first difference of
$\Delta x_n\,\Delta y_n$, so that (13-1) can be written as (13-2).

$\Delta^2 Z_n = y_{n+1}\,\Delta^2 x_n + \Delta x_n\,\Delta y_n + \tfrac{1}{2}\,\Delta(\Delta x_n\,\Delta y_n)$   (13-2)

These equations, and the assumption of two- or three-valued
second differences, permit the arithmetic part of the DD²A
integrator to be mechanized as shown in Figure 13-1.
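
The second difference identity for trapezoidal integration can be checked numerically. The sketch below uses exact rational arithmetic and forward differences (Δu_n = u_{n+1} - u_n); the function names and the random test data are illustrative assumptions.

```python
from fractions import Fraction
import random

def trapezoid_sums(x, y):
    """Z[k] = sum over i < k of (y_i + dy_i / 2) * dx_i, with Z[0] = 0."""
    z = [Fraction(0)]
    for i in range(len(x) - 1):
        dx, dy = x[i + 1] - x[i], y[i + 1] - y[i]
        z.append(z[-1] + (y[i] + dy / 2) * dx)
    return z

def second_difference_rhs(x, y, n):
    """y_{n+1} d2x_n + dx_n dy_n + (1/2) * first difference of (dx_n dy_n)."""
    def dx(k):
        return x[k + 1] - x[k]

    def dy(k):
        return y[k + 1] - y[k]

    d2x = dx(n + 1) - dx(n)
    d_dxdy = dx(n + 1) * dy(n + 1) - dx(n) * dy(n)
    return y[n + 1] * d2x + dx(n) * dy(n) + d_dxdy / 2

def identity_holds(x, y):
    """Compare the second difference of Z with the right-hand side at every step."""
    z = trapezoid_sums(x, y)
    return all(
        (z[n + 2] - z[n + 1]) - (z[n + 1] - z[n]) == second_difference_rhs(x, y, n)
        for n in range(len(x) - 2)
    )

random.seed(0)
xs = [Fraction(random.randint(-50, 50), 8) for _ in range(12)]
ys = [Fraction(random.randint(-50, 50), 8) for _ in range(12)]
print(identity_holds(xs, ys))
```

Because the identity is algebraic, it holds for arbitrary step sequences, not only for the single-increment second differences the integrator assumes.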


Figure 13-1. Arithmetic Mechanization of the DD²A Integrator
(Labels in figure: d²Z, y, dy, dx, dx dy, d²y, d²x.)


For rectangular integration the term ½ d(dx dy) is not added
to r. Each indicated transfer is controlled by a second
differential (generated by some integrator), or is always additive,
so that each transfer is simple, second differentials being
only +1, -1, or 0. The overflow mechanism, producing d²Z,
is the same as that used in the DDA to produce a three-valued
dZ.


B. DD²A Scaling Equations. - The structure of the DD²A integrator
is further clarified by scaling considerations. The existence
of an X register is a convenient assumption. The X register is
not a part of the real integrator, but its assumed presence
assists in the proper scaling of x.

Let n1 be the number of places in y, n2 the number of places
in dy, n3 the number of places in dx, and n4 the number of
places in X (in each case, excluding sign). Assume n1 + n2 =
n3 + n4. This assumption is natural, and in product formation
in the DDA is necessary. Then the integrator, with the X
register added, has the form shown in Figure 13-2.


Figure 13-2. Integrator with the X Register Added


Let $S_u$, for any y-number u, be the scale factor of u, so that
the real value of the variable u is $2^{S_u}$ times the machine value
of u (this being bounded by 1 in absolute value). Let w(du),
for any differential du, be the numerical value of (du) in
real u-units, and let $S_{du}$ be such that $2^{S_{du}}\,w(du) = 1$ u-unit;
let $w(d^2u)$ and $S_{d^2u}$ be analogously defined. Equations
(13-3) follow at once.

$n_1 = S_y + S_{dy}$
$n_2 = S_{d^2y} - S_{dy}$        (XIII-3)
$n_3 = S_{d^2x} - S_{dx}$
$n_4 = S_x + S_{dx}$

The proper alignment of the integrator registers requires
knowledge of the weights of $(y\,d^2x)_{max}$, $(dy\,d^2x)_{max}$ and
$(dx\,d^2y)_{max}$, as these functions enter into the arithmetic
operations. These weights are given in equations (13-4).

$(y\,d^2x)_{max} = 2^{S_y} \cdot 2^{S_x} \cdot 2^{-(n_3+n_4)}$
$(dy\,d^2x)_{max} = 2^{-n_1} \cdot 2^{S_y} \cdot 2^{S_x} \cdot 2^{-(n_3+n_4)}$        (XIII-4)
$(dx\,d^2y)_{max} = 2^{-n_4} \cdot 2^{S_x} \cdot 2^{S_y} \cdot 2^{-(n_1+n_2)}$

$(y\,d^2x)_{max}$ is the numerical value of a full r register, which
gives the scale of $d^2Z$ as in (13-5).

$w(d^2Z) = 2^{-S_{d^2Z}} = 2^{S_y + S_x - (n_3+n_4)}$, or
$S_{d^2Z} = n_3 + n_4 - (S_y + S_x)$.        (XIII-5)


$(dy\,d^2x)_{max} = 2^{-n_1} \cdot (y\,d^2x)_{max}$, which assures that the
positioning of dy to permit its addition to y, as required by
its scale, is consistent with its positioning with respect to
dx dy, to permit the addition of dy d²x to the latter quantity.
That is, dy adds to y, n1 places from the most significant
end of y, and dy d²x adds to dx dy, n1 places from the most
significant end of r. Similar relations hold for dx. This shows
that the alignment in Figure 13-2 gives a proper structure for
the arithmetic part of a DD²A integrator. This leads directly
to a mechanization in which each register is held in a channel,
on a drum, or in cores, the registers being processed
serially.

Finally, from n1 + n2 = n3 + n4, $(y\,d^2x)_{max} = (x\,d^2y)_{max}$.
Although the X register is not a part of the simple DD²A
integrator, the presence of the register is assumed in scaling
x, so that the use of n4 is meaningful.
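
The scaling relations can be exercised on a concrete set of scale factors. The sketch below assumes the convention that one increment du has weight 2^(-S_du) and that a full-scale machine value of u has real value 2^(S_u); the function name and the sample scale factors are hypothetical.

```python
def dd2a_scaling(s_y, s_dy, s_d2y, s_x, s_dx, s_d2x):
    """Return (n1, n2, n3, n4, S_d2Z) from the scale factors, per (13-3) and (13-5)."""
    n1 = s_y + s_dy        # places in y
    n2 = s_d2y - s_dy      # places in dy
    n3 = s_d2x - s_dx      # places in dx
    n4 = s_x + s_dx        # places in X
    if n1 + n2 != n3 + n4:
        raise ValueError("scale factors violate n1 + n2 = n3 + n4")
    s_d2z = n3 + n4 - (s_y + s_x)
    return n1, n2, n3, n4, s_d2z

# Hypothetical scale factors chosen to satisfy the alignment assumption.
n1, n2, n3, n4, s_d2z = dd2a_scaling(s_y=2, s_dy=10, s_d2y=14,
                                     s_x=1, s_dx=9, s_d2x=15)

# Exponents (base 2) of the maximum products.
y_d2x_exp = 2 - 15         # (y d2x)max  = 2^(S_y) * 2^(-S_d2x)
x_d2y_exp = 1 - 14         # (x d2y)max  = 2^(S_x) * 2^(-S_d2y)
dy_d2x_exp = -10 - 15      # (dy d2x)max = 2^(-S_dy) * 2^(-S_d2x)
print(n1, n2, n3, n4, s_d2z)
```

For these values the full r register weight, the d²Z increment weight, and the product (x d²y)max all come out as 2^-13, and (dy d²x)max is smaller by exactly n1 binary places.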

C. Input Scaling. - The integrator must have a decoding segment
in addition to its integrating unit. The decoding segment
performs the selection of those outputs which are its own inputs,
and their fusion into three-valued second differentials, d²y
and d²x. Second differentials will now be treated as
three-valued, rather than two-valued, to afford greater accuracy.
Two-valued second differentials, however, permit a very
significant simplification in machine design, and should be
used if consistent with accuracy requirements. The fact
that each integrator output is three-valued does not assure
that a d²y is also of this form, since there may be several
inputs to d²y. To enable the input second differentials to
be represented in this simple way, the assumption is now
made that inputs to a d²y (or d²x) are scaled as follows:

The first two appearing in the serial processing of the
integrator have the same scale. Those which appear thereafter
are scaled upward by powers of 2 (the third input would
have twice the weight of each of the first two; the fourth, four
times this weight; the fifth, eight times, and so on). The
inputs to d²y (or d²x) now resemble a binary number, differing
in that the two least significant digits have equal weight, and
in that each digit may be -1 as well as +1 or 0. The decoding
process for d²y consists of identifying the inputs to d²y,
treating them collectively as a number, and then adding this number
into an r register; the final three-valued carry resulting from
this additive process is d²y.

Inputs are selected for d²y from the two d²Z memory channels,
Z1 and Z2, by code marks held in an extension of the y register.
The small register holding the residue of the additive process
is an extension of the dy register. Similar registers are
present for the decoding of d²x. The integrator word structure
is shown in Figure 13-3, along with the d²Z channels, which
are shared by all integrators.


Figure  13-3.  Integrator  Word  Structure 


D. Output Modes. - The integrator is completed by adding a set
of code marks in the last bit position to permit sign reversal
of the output, and to make use of a decision process. A mark
at this point in the dy register results in the output being set
to 1 or 0 depending on whether or not y is positive. A mark
in the y register results in the normal addition of the dy
register to y or its replacement by 0 depending on whether
the Z channels hold 1 or 0 at this point. In this way the sign
of y numbers may be used to effectively replace the y number
of the controlled integrator by a constant, in response to a
cut-off signal. The sign reversal mark is placed in the dx
register. An origin mark is also placed in the dx dy register
to distinguish one integrator as the first.


The final form of the integrator, including an associated
marking channel, is given in Figure 13-4.


Figure 13-4. Final Form of the Integrator Including an
Associated Marking Channel
(Marks shown: Decision Command Mode; Output Sign of y; Sign Reversal;
Origin Mark.)


E. Programming for DDA and DD²A. - The interconnection of
integrators required for a given computation in the DD²A is
identical to that required in the DDA. The scaling equations
which must be satisfied are equations (13-3) and (13-5) of
13-1B.

F. Logical Equations for DD²A.

1. Memory Structure. - The logical equations will be derived
in terms of the word structure given in Figure 13-4, and
the memory structure shown in Figure 13-5. The read
flip-flops in the recirculating channels have the subscript
"1" and the write flip-flops the subscript "2"; the r channel
is one bit shorter than the other channels, this bit being
added  bv  passage  through  A^;  the  channels  in  the  mechani¬ 
zation  to  be  given  do  not  recirculate. 


Figure 13-5. DD²A Memory Structure

2. Phases. - P1 and P2 are used to give the Decode and Integrate
phases of the integrator, as well as the Idle and Compute
controls, shown in Table 13-1.

TABLE 13-1

P1  P2
0   0    Idle
0   1    Decode
1   1    Integrate

Decode passes to Integrate

at  Pi  P £  Ft  ;  Integrate  passes  to  Decode  at  Pi  Fj  Fz .  Idle 
is  entered  from  integrate  at  the  origin  mark  if  the  Stop-Go 
button  is  down,  and  idle  passes  to  Decode  at  the  origin  mark 
if  the  Stop-Go  button  is  up.  The  equations  for  Pi  and  P<-  are: 


SPx  =  P,  P2  F, 


ZPj  *  Fj  Fj 

SPa  *  ^  F,  F,  D,  Go 


(XIII -6) 


ZP2*  P2  F,  Fa  D,  Stop 
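
The phase sequencing described for Table 13-1 can be summarized as a small state machine. This is a sketch under stated assumptions: `end_of_pass` is a hypothetical stand-in for the F timing conditions in the equations above, `origin_mark` stands for the origin mark being present, and `go` is True when the Stop-Go button selects Go.

```python
from enum import Enum

class Phase(Enum):
    IDLE = (0, 0)        # (P1, P2) states from Table 13-1
    DECODE = (0, 1)
    INTEGRATE = (1, 1)

def next_phase(phase, end_of_pass, origin_mark, go):
    """Return the phase that follows, given the (assumed) control conditions."""
    if phase is Phase.IDLE:
        # Idle passes to Decode at the origin mark when Go is selected.
        return Phase.DECODE if (origin_mark and go) else Phase.IDLE
    if not end_of_pass:
        return phase
    if phase is Phase.DECODE:
        return Phase.INTEGRATE
    # End of Integrate: enter Idle at the origin mark when Stop is selected.
    if origin_mark and not go:
        return Phase.IDLE
    return Phase.DECODE
```

In normal running the machine alternates Decode and Integrate for each integrator word; the Stop-Go control only takes effect at the origin mark, so a computation cycle is never interrupted partway through the integrator sequence.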

G. Decode. - The decoding operation for d²y will be treated, that
for d²x being analogous. Code marks appear in B1, and the
inputs in Z1 and Z2. K1 is used to distinguish the first input from
those that follow, and Y1 and Y2 are the d²y flip-flops. The values
corresponding to the flip-flop states are as listed in Table 13-2.


TABLE  13-2 

Yi  Ys  Z8 

11  111 

0  0  0 

10-110 




The first input is transferred to the d²y flip-flops, and thereafter by the scaling assumption given in 13-1C. The value held in these flip-flops is equal in weight to that of the next d²z decoded. When a code mark appears in B1 the partially formed d²y in Y1 and Y2 is added to the d²z in Z1 and Z2. If the sum is +2, 0 or -2 it is passed on to the next code position, where the d²z has double the weight of the last. If the sum is 1 or -1, the residue bit in C1 is used. The appearance of a 1 in C1 records a deficiency of 1 in what was transmitted in Y1 and Y2 the last time a sum of 1 or -1 occurred. This deficiency is removed by sending 2 or 0, respectively; also, a 0 is written in the residue position. If a sum of 1 or -1 arises and the residue bit is a zero, 0 or -2, respectively, are transmitted and a 1 written in the residue position. The last value held in Y1 and Y2 is d²y.
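The residue rule of the preceding paragraph can be written out as a single step function. This is a minimal sketch of the rule as stated in the text, not of the flip-flop logic; `residue_step` and its argument names are hypothetical, and `total` stands for the sum of the partially formed d²y and the incoming d²z.

```python
def residue_step(total, residue):
    """Return (transmitted, new_residue) for one code position.

    total   : sum of the held value and the incoming increment, in -2..2
    residue : residue bit (0 or 1) recording an earlier deficiency of 1
    """
    if total % 2 == 0:
        # +2, 0 or -2 is passed on unchanged; the residue bit is untouched.
        return total, residue
    if residue == 1:
        # An earlier deficiency is made up: send 2 (sum 1) or 0 (sum -1).
        return total + 1, 0
    # No deficiency pending: send 0 (sum 1) or -2 (sum -1) and record one.
    return total - 1, 1
```

The transmitted value is always even, so it can be carried to the next code position at double weight, and `transmitted + new_residue` always equals `total + residue`, so nothing is lost in the rounding.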

The terms of the logical equation related to Decode are given for d²y and d²x; X1 and X2 are set to 1 at P1 F1 F2 so that no d²x code mark will result in a d²x of 1. The term "A - B" is defined by A - B ≡ A B̄ + Ā B, that is, the exclusive OR of A and B. Also, T5 and T6 are used where these are given by:


T-  =  <Zj  Yj  (Ci 
Tf  *  (Z,X,  (Ej 


Zt)  *  z ■  Yj  (Cj  -  Y-)  )  +  B, 
Z9)  +  Z,  X;  (E;  -  X?)  )  +  A, 


SK.  =  Pj  B;  Fj 


ZK.  =  P,  Fj  Fa 
SK-  *  Pj  Aj  Fj 
ZK-  =  Pj  F,  F? 


(XIII  -7) 




SY!  =  P:  (ic,  Bi  Zi  ♦  K!  Yx  T5) 

ZY1=  P -x  Fl  Fs  +  P,  (Ki  B,  Zt  ♦  Kx  Yx  Ts) 
SYs  =  Pt  (K:  Bj  Zs  +  Ki  Tr  Ci ) 

ZY8  =  Pa  (Kj  Ba  Z2  +  K,  Te  Ca ) 

SC*  =  P,  Ps  (Ca  -  Z»  -  Ya ) 

ZC8  =  Pa  Ps  (Ca  -  Za  -  Ya  ) 

SXa  =  P,  Fa  F*  +  Px  (K„  At  Zj  +  Ks  X»  T. ) 
ZXV  =  Pa  (Ks  A,  Zt  ♦  K,  X,  T,  ) 

SXg  »  Pa  r a  Fs  +  Pt  (K,  A,  Z*  +  K*  T.  Ex ) 
ZXj  »  Pa  (Ks  Ax  Z*  +  Kj  T|  Ex ) 

SE,  =  Pa  P,  (E»  -  Z,  -  Xx  ) 

ZEa  »  Pi  P8  (Ej  -  Zx  -  Xx  ) 


H. Integrate - The equations defining Integrate are given in this section, and should be considered operation by operation.

The sign convention is to give non-negative numbers a sign digit of zero. Q is the state P1 (F1 + F2).

1. dy + d²y

K1 is used as the carry, and is set to 1 at P1 F1 if Y1 = 1.


The  equations  follow. 

SKj  =  Px  Fa  Ya 

ZKj  =  Pi  Fj  Ya  +  Pj  (Ys  -  Cj ) 


(XIII-8) 




SC2  =  Q  (Cl 
ZCE  =  Q  (C, 


K.  )  +  P,  F;  Fs  C1 
Kt)  +  Pj  F>  Fg  C: 


2. dx + d²x - The equations are similar to those of (1).


SKg  =  P,  F,  X! 

ZK?  =  P:  F!  X.  i-  P  (  Xr:  -  E: ) 
SEg  =  Q;  <E,  -  Kg  )  +  Pj  Fj  F-  E; 
ZEj  =  Qj  (E:  -Kj)  +  P;  F;  Fa  E; 


( XIII -9 ) 


3. y + dy - K3 is used as the carry.
SK*  =  P;  Ks  B,  C; 

ZK3  =  Pj  Fi  +  P:  K-  Bi  C, 

SB,  *  Q  (Ks  -  Bj  -  C; )  +  Px  F<  F»  Ba 
ZB,  »  Q  (Ka  -  B.  -  Cj )  +  P,  F-_  Fa  B. 


(XIII-10) 
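The additions in operations (1) to (3) are bit-serial: the sum digit is the three-way "A - B - C" (exclusive OR) of the two operand digits and the carry flip-flop, and the carry is set or zeroed by S and Z terms. The sketch below shows the idea in software. It assumes the usual carry rule (set when both digits are 1, zeroed when both are 0); the complement bars of the report's equations did not survive scanning, so this rule is an assumption, and all names are hypothetical.

```python
def serial_add(a_bits, b_bits):
    """Add two equal-length bit lists, least significant digit first."""
    carry = 0
    out = []
    for a, b in zip(a_bits, b_bits):
        out.append(a ^ b ^ carry)   # sum digit: three-way exclusive OR
        if a and b:
            carry = 1               # "S" term: set the carry flip-flop
        elif not a and not b:
            carry = 0               # "Z" term: zero the carry flip-flop
        # otherwise the carry flip-flop holds its state
    return out


def to_bits(n, width):
    """Little-endian bit list of a non-negative integer."""
    return [(n >> i) & 1 for i in range(width)]


def from_bits(bits):
    """Inverse of to_bits."""
    return sum(bit << i for i, bit in enumerate(bits))
```

For instance, `from_bits(serial_add(to_bits(5, 5), to_bits(6, 5)))` recovers 11.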


4. d²y dx + d²x dy + d²y d²x. - Subtraction is to be accomplished by complementation and addition. This means that there must be an initial carry for each such complementation, that must absorb d²x d²y. If both d²x and d²y are -1, there will be an initial carry of 3, so that a double carry is needed. These two flip-flops are K4 and K5, and their states are defined in Table 13-3. They are set initially at P1 F1.




TABLE 13-3

K5   K4   Carry
0    0    0
0    1    1
1    0    2
1    1    3
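The need for an initial carry per complementation can be checked with ordinary fixed-width arithmetic. The sketch below is a generic illustration, not the report's logic: it subtracts two operands by adding their ones' complements, which requires an initial carry of 2, one for each complemented operand. The word width and function name are assumptions.

```python
def subtract_by_complement(a, b, c, width=8):
    """Compute (a - b - c) modulo 2**width using only addition."""
    mask = (1 << width) - 1
    not_b = ~b & mask          # ones' complement of b
    not_c = ~c & mask          # ones' complement of c
    initial_carry = 2          # one initial carry per complementation
    return (a + not_b + not_c + initial_carry) & mask
```

In the integrator the product term d²x d²y is absorbed into the same initial carry, which is why values up to 3 can occur and two carry flip-flops are provided.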

In the equations which follow, T1, T2 and S1 are given by:

Tx  =  Xj  (Cj  -  %),  TP  =  Yj  (Ea  -  Ys)  and  S;  =  Tx  -  T-  -  K, . 

SK,  =  PiFx(Xi  Yx(Xa  -  Y8  )  +  Y,  X:  %  r  Y.  Ys  Xa )  P:K,(Ks  -  Tt  Ts  ) 
ZKj  =  Px  Fj  Fs  +  Px  Itj  (K,  -  T  T?) 

(XII1-11) 

SK*  *  Px  Fx  X,  X,  Yx  Ys 

ZKe  =  Px  Fx  Fs  *  Px  K*  (K,  (T;  +  Ts )  +  T;  Ts ) 

S1 is the sum dy d²x + dx d²y + d²y d²x, and is added to dx dy, while 1/2 S1 is added to r.

5. dx dy + S1.

K6 is the carry.

Sit  «  PiReStD;  SDs  =  Q  (Sx  -  Dx  -  K«  )  ♦  P,  F;  Fs  Dx 

ZK«  «  P;FX+PX  X»5,T5a  ZD-  *  Q  (5:  -  D:  -  K  )  *  P.  F:F?B: 


6. r + dy dx + sign of S1. - As r passes from A1 to A2, dx dy is added to r. In the addition of 1/2 S1 to r, the sign digit of S1 is added at P1 F1 F2 when the sign of r is in A1. At the same time, the sign digit is added to the second-last r digit while passing from A2 to A3. By virtue of the sign convention used, this procedure effects the addition of 1/2 S1 to r. K7 is the carry used.




SKy  =  Pi  K,  At  D, 

ZKj  =  Pi  Fj  +  Pj  Kj  Aj  . 

SAs  =  Pi  (A}  -  D|  -  K>  •  F-  Fa  Si ) 

ZAa  =  P,  (A,  -  D,  -  K,  -  F,  Fa  S, ). 


(XIII-13) 


7. r + y d²x + 1/2 S1. - As r passes from A2 to A3, it adds to y d²x and 1/2 S1. The term T3 = X1 (B1 - X2) (F1 + F2) is used for y d²x, as the latter does not add to r at P1 F1 F2. Also, T4 = S1 (F1 + F2) is used for 1/2 S1, as this does not add to r at P1 F1 F2.

T3 and T4 are first added using K8 as the carry; S2 is the sum in this addition. S2 and r are then added using K9 as the carry, with S3 as the sum. S3 is the new r digit except at the sign position, where S3 must be complemented if overflow (of either sign) occurs. The equations follow.


5r 

■  T, 

-  T«  -  K. 

SK. 

•  P> 

F,  F8  X,  X,  4-  P, 

ZK, 

*  P» 

♦  P,  K.  T.i  T * 

SKo 

*  P» 

K,  A*  S, 

ZK* 

*  P| 

F,  ♦  P,  K.  A,  S, 

Si 

iK, 

-  A,  -  S, 

(XII1-I4) 




SAj  =  PA  (S3  (Ft  +  Fa )  -  Aj  F1  F3  ) 
ZA3  =  P,  (5:,  (f  j  +  Fg )  -  Aa  Fx  Fe  ). 


If L1 and L2 are the expressions for positive and negative overflow, respectively, they are:


Lf  =  pi  p;  (Aa  S,(K,  -  Aj )  +  (Aa  -  S3 )  Aj  Kg  ) 
L2  =  Pj  F4  Fj  (Aa  St(K^  -  A*)  (Af  -  Sj  )  A<j  R* ). 


(XIII  —  15) 


Their sum, excluding impossible cases, is


Lj  +  4  1  Kj  -  Aj  -  Aj  *  Sj  • 


<  XI II  - 1 6 ) 


Output. - Code marks appearing in B1, C1 and E1 affect the output communicated to other integrators. If there are no marks at this point the normal overflow given in the last section is used. P1 F1 F2 E1 indicates sign reversal; P1 F1 F2 C1 requires that the sign of y replace the normal overflow, y ≥ 0 being sent as 1 and y < 0 as 0; P1 F1 F2 B1 Z1 cause the normal overflow to be transmitted and P1 F1 F2 B1 Z̄1 force an output of zero.

Let O1 and O2 represent the output as in Table 13-4. Then

TABLE 13-4

O1   O2   Output
1    1      1
0           0
1    0     -1




O1 and O2 are defined as follows.


SOi  =  P:  F,  F;  (Bj  C^L.  +  La)  +  B;  Z.  (L,  +  Ls )  +  C:  B,) 

ZO,  =  P1  Fx  Fa  (B,  C-_  L,  L-  +  Bj  Zj  Lj  L2  +  B;  Z,  +  Cj  Bs ) 

SOa  =  P;  F,  Fa  (L*  E,  +  L-  EJ 

ZOa  =  P,  Fj  Fa  (Ls  Et  +  Lx  E,  )  (XIII-17) 


Complete  Logical  Equations.  -  The  unsimplified  logical 
equations  are  now  listed. 


T-.  =  X,  (Cx  -  Xs) 

S- 

=  T.  -  Ta  -  IC, 

Ta  =  Y,  (E.  -  Ya) 

s. 

=  T3  -  T4  -  Ke 

T3  =  Xi  (Bl  -  X?)(F:  -  Fa) 

s. 

*  Sa  -  Aa  -  Ka 

T,  =  S,  (F;  +  Fa) 

Q 

*  P;  (F,  +  Fa ) 

Ts  s  (Z;  Yj  (C:  -  Z3)  +  Z,  Y,  (0,  -  Y?)  )  +  Bj 
T?  =  (Z.  X.  (E;  -  Ze)  ♦  Z.  X,  (E;  -  X-)  )  +  Ax 


V) 

X 

=  P»  Fx 

B,  +  P: 

F. 

Y; 

ZK; 

=  Pi  Fj 

F?  ♦  P, 

F- 

Y, 

♦Pi 

(Y? 

-  C;) 

SK- 

=  P  ;  F, 

A;  4  P; 

F, 

X; 

ZKj 

=  P,  F i 

F-  +  P: 

F. 

X 

+  p. 

(Xr 

-  E.) 

SK, 

=  P;  K, 

B.  C 

ZK, 

=  Pi  F; 

+  P;  K, 

B: 

Cl 

(XIH-lh) 


SK4  =  P,  E.  (X.  Y,  (Xr  -  Y- )  +  Y.  Xj  Xr  +  X:  Y.  Y~  )  +  P,  K*  (1C  -  T-  Tr) 




ZK*  =  Pi  F,  P2  +  Pi  K4  (Ks  -  Tj  Ta ) 

SK$  =  Pt  F»  Xt  Xs  Yi  Ya 

ZKg  =  P,  Fj  Fs  +  Pi  Kg  (K*  (Tj  +  T8)  +  Ti  Ts) 
SKc  =  Pj  K*  Si  D. 

ZK*  =  P,  Fi  +  P.  K,  S-  Di 
SK,  =  Pj  K.  A-  D, 

ZK,  =  P  F,  +  P,  It,  Ai  5, 

SKe  =  Pi  F;  Ff  Xi  X?  +  P,  K>  T-  T* 

ZK*  =  Pi  +  P,  K,  Ta  T* 

SKa  =  P  .  K,  Aa  Sa 

ZKa  =  P,  F,  f  Pj  Ka  Ap  Sa 

SYi  *  P:  (K.  B:  Zj  +  K:  Y,  T: ) 

ZY;  «  P:  F!  Fa  +  P.  (It  B-.  Z.  +  K:  Y,  Tr ) 

SY,  «  Pi  (Ki  B,  Z,  *  K;  T*  C- ) 

ZY-  «  P.  (K.  Bt  Z8  +  Ki  T  Ci) 

SX;  -  P:  Fj  Fa  +  P,  (K.  A;  Z  +  K,  X;  T» ) 

ZX.  «  P.  (Ks  A:  Z.  +  K?  X.  T-  ) 

SP.  ■  P,  P?  P, 

ZP.  »  F.  Fa 

SP?  =  P,  F.  Fr  D-.  Go 

ZPj  *  P?  F-  Fa  D  Stop 




L,  =  P;  Ft  Fa  <Aa  Sa  (Ke  -  Ag  )  +  (Aa  -  Ss )  Ag  Ke  ) 

=  Pj  F s  ( Ag  Sg  (Ke  -  Ag  )  +  (Ag  -  Sg )  Ag  Kg 
S0:  =  P:  F.  F2  (BT  C,  (Lj.  +  Lg)  +  B;  Z-  (Lj  +  La )  +  Cx  Ba 

ZO-  =  P:  Fj  Fg  (Bj  C,  L.  Lg  h  B:  Zj  Li  La  +  Bj  Zj  +  Cx  Ba) 

SO.  =  P,  F,  Fa  (  L,  E:  +  L.  E,  ) 

ZO.  =  P;  F-  Fg  (L  Ej  +  L,  Ei ) 

SAr  =  P.  A,  +  Pg  (Pj  A:  +  P,  (A,  -  Dj  -  K.  -  F,  Fa  S._ )  ) 

ZA.  =  Pa  A-  r  P?  (P,  A,  -  P,  (A  -  D-  -  K.  -  F.  Sa )  ) 

SA-  =  P.  Ag  ♦  p?  (P,  A?  *  P.  (Sr,  (F  +  Fg)  -  Ag  F,  Ff )  ) 

ZA,  =  Pg  Ag  +  Pg  (P,  Ag  +  P;  (S.  (F,  +  Fa)  -  Ag  F:  F,)  ) 

SB,  =  Pr  B)  +  Pr  (P-  B-  +  Q  (Kg  -  B:  -  C, )  +  P,  F*  Fa  B,  ) 

ZB?  *  Pr  B:  h  P.  (P.  B.  ♦  Q  (Kg  -  B*  -  Cj )  +  Pi  F,  F,  B.  ) 

SCr  =  P.  C5  +  Pr  (P:  (  C-  -  Z,  -  Y, )  +  Q  (C:  -  K, )  +  P,  F,  F?  C, ) 
ZCg  =  Pr  Ci  +  P-  (P,  (C,  -  Z,  -  Y,  )  +  Q  (C:  -  KJ  +  P,  F5  F,  Ca ) 
SDr  =  Pg  D-  +  Pg  (Q  (S.  -  D:  -  K, )  +  Px  F,  F?  5, ) 

ZDg  =  Pg  5;  +  Pg  (Q  (S.  -  5,  -  K*  )  +  P,  F.  Fg  5, ) 

SEa  =  Pg  E,  +  P.  (P.  (E:  -  Z,  -  Y,  )  *  Q  (E,  -  Kg )  +  P,  F»  F,  E. ) 

ZEg  =  P.  E:  +  Pg  (P,  (E.  -  Z,  -  X,  )  +  Q  (E,  -  Kg)  +  P.  F,  Fa  E.  ) 

SF-  =  F, 

ZFr  =  F 




13.2 THE PDD²A AND QDD²A MECHANIZATIONS


A. Introduction - The DD²A described in the previous sections was based on a computational process which provided a basic improvement in rate-handling ability and accuracy with respect to that of the DDA based on second difference computation. Incremental computer designs which further exploit this basic design approach to obtain increased processing efficiency and computation precision are the PDD²A and QDD²A computers described below.

B. PDD²A - If the computation involves a considerable number of multiplications (vector resolutions), or if many constants appear in the equations to be solved, a sixth register may be added to the DD²A integrator. This will be the x register referred to in 13-1B, and its presence will effect a reduction of the number of integrators required in the sort of computation mentioned.

First, the generation of a product, xy, in a DD²A is accomplished by a double summation of its second difference. This second difference, at the nth step of the process, is given in equation 13-6. Except for the coefficients, the right hand side of 13-14 differs from the right hand side of equation (13-1) of 13-1A only by the presence of x(n+1) Δ²y(n). With the x register available, this term may be added to r by a transfer controlled by d²y. The output of the unit shown in Figure 13-6 is the second differential of the product xy, and a PDD²A is a machine consisting of such units. The decoding




procedure  is  the  same  as  that  in  the  DDA,  and  the  scaling 
equations  are  still  (13-3)  and  (13-5)  of  13-1B. 


Figure 13-6. PDD²A Output of the Second Differential Product of xy.

Only one PDD²A unit is required for product generation as opposed to two DD²A units, and the use of a single r number instead of two improves the accuracy of the computation.
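The statement that a product is generated by a double summation of its second difference can be checked numerically. The sketch below is a generic illustration using exact integers; it does not reproduce the coefficients of the report's equations, and the function names are hypothetical.

```python
def second_differences(seq):
    """Second backward differences of a sequence, starting at index 2."""
    return [seq[n] - 2 * seq[n - 1] + seq[n - 2] for n in range(2, len(seq))]


def double_summation(p0, p1, d2):
    """Rebuild a sequence from its first two values and its second differences."""
    values = [p0, p1]
    first_diff = p1 - p0
    for d in d2:
        first_diff += d                        # first summation: running dp
        values.append(values[-1] + first_diff)  # second summation: running p
    return values
```

Applied to the sample products `p[n] = x[n] * y[n]`, the two summations recover the product sequence exactly, which is the property the single PDD²A unit relies on.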

The presence of the sixth register also permits the introduction of scaling factors by a modification of the overflow mechanism. Normally, the DD²A integrator output is developed by noting whether the r number, after the addition of y d²x + dx dy + 1/2 d(dx dy), is more than 1/2 or less than -1/2. In the first case the output is 1, in the second -1, and if neither case obtains the output is 0; further, the output is subtracted from r to yield the r number for the next step. This same process may be carried out with respect to ±u/2 instead of ±1/2, where u is a constant held in the x register. The output will now represent d rather than d (ydx), and arbitrary scale factors may be introduced in this way. The transfer of the x




number (u) to r is now controlled by -d²z, where d²z is the integrator output on the last cycle, and dx does not add to the x register.
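The overflow rule with a scale constant can be sketched as follows. This is a simplified illustration of the thresholding described above, with the quantity added to r supplied directly as `increment`; the function and parameter names are hypothetical, and `u = 1` reproduces the normal DD²A case.

```python
from fractions import Fraction


def overflow_step(r, increment, u=Fraction(1)):
    """Add an increment to r, emit a ternary output, and feed it back.

    Returns (output, new_r). The output is +1 if r exceeds u/2, -1 if r is
    below -u/2, and 0 otherwise; u times the output is then removed from r.
    """
    r = r + increment
    if r > u / 2:
        output = 1
    elif r < -u / 2:
        output = -1
    else:
        output = 0
    return output, r - output * u
```

Over any run of steps, u times the sum of the outputs plus the final r equals the sum of the increments, so the output stream represents the accumulated quantity scaled by 1/u.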

Thus, the sixth register results in the saving of one DD²A integrator for each product generation, and of one DD²A integrator for each multiplication by a constant, when not an integral power of 2.

C. Elementary QDD²A - The addition of a seventh register, as well as a third input, yields a unit capable of quotient generation; this unit is shown in Figure 13-7.


Figure 13-7. Elementary QDD²A
(Block diagram. Legible labels: registers R, r, q, dq, dq dv, dv and v, with code and residue positions for d²u, d²q and d²v.)




Here, d²u and d²v are inputs, and the function of the unit is to produce the second differential of u/v. The unit forms d(qv-u) in r and (qv-u) in R, where q is the machine value of the quotient. The overflow, d²q, is generated with respect to ±v/2, as in the last section, although v is now a variable; in particular d²q = sgn R sgn v U(2|R| - |v|), where U is the unit step function. The output is fed back negatively to alter d'j as required by qv-u, which reflects the error in q in representing the true quotient.
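The overflow rule d²q = sgn R sgn v U(2|R| - |v|) can be tried out in a much simplified setting. The sketch below treats the increment as a whole-value step on an integer q with fixed u and v, whereas the actual unit works on second differentials of varying inputs; U(0) is taken as 0. Both simplifications, and all function names, are assumptions of this illustration.

```python
def sgn(x):
    """Sign of x as -1, 0 or +1."""
    return (x > 0) - (x < 0)


def unit_step(x):
    """Unit step function; U(0) is taken as 0 in this sketch."""
    return 1 if x > 0 else 0


def quotient_increment(R, v):
    """Overflow generated with respect to +/- v/2 from the error R = q*v - u."""
    return sgn(R) * sgn(v) * unit_step(2 * abs(R) - abs(v))


def settle_quotient(u, v, max_steps=1000):
    """Step q until the error q*v - u lies within half of |v|."""
    q = 0
    for _ in range(max_steps):
        increment = quotient_increment(q * v - u, v)
        if increment == 0:
            break
        q -= increment      # the output is fed back negatively
    return q
```

With u = 7 and v = 2 the loop settles at q = 3, leaving an error qv - u of -1, which is within v/2 in magnitude.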

A QDD²A is a machine generalized from the elementary QDD²A unit. By coding, such a unit can be used for product generation or scaled integration, as well as quotient generation.

A hardware estimate is given in the table below.

TABLE 13-5. HARDWARE ESTIMATE

Unit      Section   Diodes   Flip-Flops   Channels
DD²A      2         1200     29           6
PDD²A     3.2       1350     34           7
QDD²A*    3.3       1600     40           8

D. Mechanization and Functional Features of the Proposed and Alternate QDD²A Computers.

Register Configuration (figure labels): Registers present only when the D² multiplier is used. Three additional R-registers required if core communication not provided.



Processing capability (assuming state-of-the-art 400,000 bit/sec) of Computer No. 1: Full Aerospace Mission DDA

Proposed:

1. A multi-iteration rate, multi-increment DDA for full aerospace mission which is capable of parallel computations with quotient algorithm.

2. Simultaneous thrust cutoff and strap-down computations at 1600 iter/sec in a program of 256 DDA integrators.

3. Single precision (3 bit increment), double precision (6 bit) programmable.

Computer No. 2: Aerospace DDA (Reduced Task, i.e., No High Rate Input Processing)

Proposed:

1. Where the DDA is not allocated thrust cutoff and strap-down computations, an appropriate computer executes a 300 DDA integrator program at 213 iter/sec.

2. Quotient algorithm and single precision (bit increment), double precision (6 bit) programmable using D² multipliers and conventional multipliers.




CHAPTER  XIV 


THE  FUTURE  ROLE  OF  THE  INCREMENTAL  COMPUTER  IN  FULL  SCALE 
AEROSPACE  COMPUTER  SYSTEMS  AND  PROPOSED  STUDY  EFFORTS 

14.0 LONG-TERM DESIGN GOALS AND REMAINING PROBLEMS IN THEIR FULL ACHIEVEMENT - This contract study (see Chapter XV) accomplishes the development of aerospace incremental computer design techniques (exemplified in the QDD²A) which enable real time computation of large programs by a computer mechanization of assigned complexity at new levels of accuracy for variables having the degree of continuity ordinarily assigned to or considered feasible for DDA type computers. The ordinary concept of a GP-DDA system, with a relatively complex and costly GP being required for a full aerospace mission, actually resides in the fact that certain real time computations involve system variables which are only piece-wise continuous (apart from communication link data inputs which are assumed specially provided for). There had been a prevailing view of the inability of conventional incremental computer design and programming techniques to handle all the problematic routines involving variables with step changes and singularities, implying only limited use of the DDA. In contrast, it is proposed here that the development of the appropriate incremental computer design and programming techniques can essentially eliminate the GP, as such, in that a GP of the cost level ordinarily assumed in full mission aerospace applications is not required.

A conviction that the DDA techniques can be accomplished in an efficient mechanization implies that a significant increase in computation capability for given mechanization complexity will result in the overall computer system. An intermediate accomplishment would be the development of a GP-DDA system in which the GP is highly simplified in mechanization and has primarily low-rate supervisory capability over the DDA, the latter actually executing >95% to >99% of the computations (instead of 15% to 30%).




As shown in paragraph 11.4D, an incremental computer can have a speed advantage over a conventional GP of 6 to 1 for the same bit rates; hence, apart from ability to handle variables with discontinuities, the GP is not only basically more costly, but also slower than the incremental computer. The degree to which the GP can be simplified while retaining ability to handle variables with discontinuities is therefore the prime question. That 100% handling of the piece-wise continuous variables, which make up aerospace guidance and control computations, is possible in a hybridized incremental computer is given support by the analyses of XI B 4b, C8, C6. The first two references present new techniques and approaches to technique development for handling this problem while the last two indicate basic application of conventional decision action efficiently generalized to the QDD²A.

The problematic computations involving isolated singularities are those where high accuracy must be maintained over long term operation. These computations typically involve singularities resulting from properties of coordinate systems. In principle these computations are resolved directly by better selection of a coordinate system. There are cases where this is not permitted in the fullest sense; for example, geographic coordinates, presenting discontinuity problems, are designated for display purposes and are necessary for gravity computation. An incremental computer must therefore be able to handle the geographic coordinate problem so that the required accuracy is maintained when the coordinates differ significantly from the well defined coordinates of the singularities. It is believed that, provided this particular example problem can be overcome in a proposed study effort, the solution of any other problem is relatively straightforward.

The second class of design problems proposed for further study are those of further unification by system and logical design analysis and DDA simulation of the many computation algorithm and digital processing discoveries




made during this contract effort. While quantitative computation analyses and simulations demonstrate that the QDD²A is a remarkable step in computer design, it is believed the refinement of the developed design techniques, incorporated in the QDD²A, can offer significant increases in computation capability and reduction in mechanization complexity.

14.1 BRIEF SUMMARY OF PROPOSED STUDY EFFORTS

A. Completion of Simulation Evaluation and Analysis of All Developments of Phase II and Proposed Analytical Efforts

1. Evaluation and comparison of alternative QDD²A algorithms

2. Evaluation and comparison of alternative DD²A algorithms

3. Digital Stieltjes algorithm for near conventional single increment DDA and generalization to multi-increment computers

4. Overflow inhibitor and pulse stream transducer

5. Evaluation of singularity, discontinuity pass programs of adapted mechanizations

B. Logical Design Investigations

1. Optimized communication mechanization for large problem incremental computers and GP-DDA with atrophied GP

2. Extension of multi-increment arithmetic unit design studies for band limited variables

3. Programmable single, double precision modes involving the D² multiplier

4. System evaluation and optimization by execution of modal mechanization, register, and arithmetic unit cost relations analysis for minimal hardware count at assigned computation capability.




C. Analytical Investigation of Improved Processing Complementation Structure for GP and DDA in GP-DDA Computer System with Atrophied GP

1. Immediate goal is to make possible the inherent but not fully attained computation rate superiority of DDA over GP (which is 4 or 6 to 1) in >95 percent of aerospace programs without excessive supervision of DDA by GP

2. Ultimate goal of greatly reducing or essentially eliminating the major GP hardware cost

3. Refine the quantitative computation capability formulations of Phase II in light of advances in digital Stieltjes integration

D. Further Analysis of the Pulse Stream Analog to Digital Converter and Overflow Inhibitor for Improved Computation Accuracy at Low Rate Phases of Inputs.


CHAPTER  XV 


BRIEF  SUMMARY  OF  ACCOMPLISHMENTS  OF  THE 
HSODA  STUDY  EFFORT 

15.0 DEVELOPMENT OF FULL SCALE INCREMENTAL COMPUTER SYSTEM (QDD²A) FOR FULL AEROSPACE MISSION

A. Total program includes: programmable input processing (for simultaneous thrust cutoff and strap-down navigation) with multi-increment accuracy (assuming modest clock rates) at 1600 iterations/sec, and internal computations at 100 iterations/sec for a 256 DDA integrator program.

B. Communication hardware simpler than a conventional single increment DDA of same capacity (an invention).

C. Computation with programmable single precision (3 bit) and double precision (6 bit) transfer action (an invention).

D. Multi-increment quotient algorithm (an invention) for orbital and re-entry computations.

E. The total program exceeds the computation capacity of four 3 bit increment DDA computers and executes 6 bit increment computation programmably for error sensitive routines.

F. The total program, combined with a slow multiplier general purpose computer, provides simpler system mechanization than existing aerospace computers, which generally have one-half or less the computation capacity. A proposed QDD²A mechanization has a 12 word core memory for communication and input absorption and 79 flip-flops (30 percent less than the strap-down processor section constructed).




The advantage over existing GP-DDA systems can be further increased by proposed further studies directed toward utilizing the inherent but unrealized computation superiority of DDA over GP in 90 percent of aerospace program routines.

15.1 DEVELOPMENT OF CONCEPTS FOR DESIGN OF MULTI-INCREMENT COMPUTER WITH SIMPLIFIED MECHANIZATION - An epochal breakthrough in multi-increment computer design has been made for the incremental computation of the band limited variables which characterize typical DDA computations. The DDA integrators (or generalized integrators) have outputs which represent second differentials rather than first differentials. The communication of second differentials is mechanized in the manner of first differentials in a conventional DDA. Developments during Phase II which have exploited the new concepts are:

A. Single increment communication for a multi-increment computer attaining a new level of communication mechanization simplicity.

B. Quotient algorithm computation with multi-increment accuracy not approximated by any previously existing DDA.

C. Simplified arithmetic unit design for multi-increment computation.

D. Second order integration algorithm realization in simplified mechanization.

15.2 DEVELOPMENT OF QUOTIENT ALGORITHM FOR MULTI-INCREMENT COMPUTATION - All previous quotient algorithm computers have been limited to basically single increment accuracy, although this increment might have a variable scale. The technical design problems which have inhibited the development of a multi-increment algorithm have been overcome. The newly developed algorithm has a mechanization cost comparable to that of the variable increment algorithm, which is less accurate.




15.3 DEVELOPMENT OF A NEW MULTI-TRANSFER UNIT (THE D² MULTIPLIER) WITH SIMPLIFIED MECHANIZATION (FOR COMPUTERS WITH SECOND DIFFERENCE COMMUNICATION OF BAND LIMITED VARIABLES)

A. A breakthrough in incremental arithmetic unit capability for given complexity has been made. Previously presumed inherent complexity levels have been lowered for the important class of band limited variables typically involved in internal computations or programmed on simulation incremental computers.

B. The D² multiplier unit can perform product calculation with higher accuracy than two conventional multiplier units for 3 bit transfer, but the new unit costs the same as a single one. The D² multiplier unit can perform many bit increment computations depending on the scaling properties of the variables; there are example calculations in which the simple unit can exceed the speed and accuracy of a high performance general purpose computer.

15.4 A BREAKTHROUGH IN ACCURACY IN CONVENTIONAL TYPE SINGLE INCREMENT DDA BY MODIFICATION TO EXECUTE A NEW DIGITAL STIELTJES INTEGRATION ALGORITHM CONTAINS:

A. Integration with respect to independent variables other than time (Stieltjes integration) constitutes >75% of all DDA computations, including division, reciprocal, product, and input processing.

B. An incorrect digital Stieltjes integration algorithm has been determined to be the major error source in single increment DDA in these computations, including reciprocal calculation.

C. By modest elaboration in mechanization of the conventional DDA, new accuracy levels are attainable.




D. Doppler damped inertial navigation by near conventional DDA is for the first time attainable.

15.5 PRELIMINARY DEVELOPMENT OF INCREMENTAL COMPUTERS OF INTERMEDIATE COMPLEXITY FOR SPECIAL APPLICATIONS WHERE NO INPUT PROCESSING IS REQUIRED.

A. Certain airborne and aerospace applications require relatively high computation capability but do not require input processing in the DDA, though perhaps in the GP of a GP-DDA system. In these applications a simpler mechanization than the full scale QDD²A is feasible.

B. The design combination of D² multipliers and quotient algorithm (the latter with programmable multi- and single-transfer) provides extraordinary computation features:

1. Two D² multiplier units (costing the same as a 3 bit transfer unit) can in many calculation routines do the work* of six DDA computers, each with 3 bit (or in certain cases more) increment accuracies.

2. Two D² multipliers and 2 single transfer units in other prevalent computation routines can provide, in parallel computation, the work of four DDA computers.


*The quotient algorithm has utility for whole word scaling as well as division, in which case effective performances stated in (1) may be reduced to that of three DDA computers.




15.6  PRELIMINARY  THEORY  OF  PULSE  STREAM  ANALOGUE  TO 
DIGITAL  CONVERTER  ERROR  STRUCTURE  AND  DIGITAL  STIELTJES 
INTEGRATION  FOR  HIGH  RATE  INPUT  PROCESSING  BY  A  SINGLE 
INCREMENT  DDA 

A.  Preliminary  analysis  and  simulation  results  for  roundoff 
reduction  processes  of  overflow  inhibition  and  digital 
Stieltjes  integration  appear  applicable  to  the  pulse  stream 
transducer  as  well  as  single  increment  DDA  systems. 

B.  An input processing algorithm based on these results for single
increment input processing can provide improved performance
with modest hardware modifications.







